Method of, and Computer Software for, Classification of Cells into Subpopulations

Field of the Invention

The invention relates to methods of cell classification. Cells are imaged 
and classified into subpopulations. The invention further relates to apparatus 
and computer software adapted to carry out such a method. 

Background of the Invention

There is currently a need in drug discovery and development and in 
general biological research for methods and apparatus for accurately performing 
cell-based assays. Cell-based assays are advantageously employed for assessing 
the biological activity of chemical compounds.

In addition, there is a need to quickly and inexpensively screen large
numbers of chemical compounds. This need has arisen in the pharmaceutical
industry where it is common to test chemical compounds for activity against a 
variety of biochemical targets, for example, receptors, enzymes and nucleic 
acids. These chemical compounds are collected in large libraries, sometimes
exceeding one million distinct compounds. The use of the term chemical
compound is intended to be interpreted broadly so as to include, but not be 
limited to, simple organic and inorganic molecules, proteins, peptides, nucleic 
acids and oligonucleotides, carbohydrates, lipids, or any chemical structure of 
biological interest. 

In the field of compound screening, cell-based assays are run on
populations of cells. The measured response is usually an average over the cell
population. For example, a popular instrument used for ion channel assays is 
disclosed in U.S. Patent No. 5,355,215. A typical assay consists of measuring 
the time-dependence of the fluorescence of an ion-sensitive dye, the
fluorescence being a measure of the intra-cellular concentration of the ion of
interest, which changes as a consequence of the addition of a chemical
compound. The dye is loaded into the population of cells disposed on the
bottom of the well of a multiwell plate at a time prior to the measurement. 

In general, the response of the cells is heterogeneous in both magnitude
and time. This variability may obscure or prevent the observation of biological
activity important to compound screening. The heterogeneity may arise from
experimental sources, but more importantly, heterogeneity is fundamental in any
population of cells. Among others, the origin of the variability may be a
consequence of the life-cycle divergence among the population, or the result of
the evolutionary divergence of the number of active target molecules. A method
that negates, compensates for, or even utilizes the variations would enhance
the value of cell-based assays in the characterization of the pharmacological 
activity of chemical compounds. 

Quantification of the response of individual cells circumvents the
problems posed by the non-uniformity of the response across a population of cells.
Consider the case where a minor fraction of the population responds to the
stimulus. A device that measures the average response will have less sensitivity
than one determining individual cellular response. However, analysis of the 
responses of individual cells will be time-consuming in the case of populations 
of large cell count. 

The cell cycle is of key importance to many areas of drug discovery. On
the one hand this fundamental process provides the opportunity to discover new 
targets for anticancer agents and improved chemotherapeutics, but on the other 
hand drugs and targets in other therapeutic areas must be tested for undesirable 
effects on the cell cycle. Historically, a wide range of techniques have been 
developed to study the cell cycle both as a global biochemical process and at the 
molecular level. 

Known methods include those that produce data describing the 
proliferative activity of a cell population. 

Measuring the incorporation of [14C]- or [3H]-thymidine (Regan, J.D.
and Chu, E.H. (1966) "A convenient method for assay of DNA synthesis in
synchronized human cell cultures" J. Cell Biol. 28, 139-143) by scintillation
counting was one of the earliest methods of determining cell proliferation, and is
still widely used today. More recent developments (Graves, R. et al. (1997)
"Noninvasive, real-time method for the examination of thymidine uptake events
- application of the method to V-79 cell synchrony studies" Anal. Biochem. 248,
251-257) have allowed thymidine incorporation to be measured in a
homogeneous microplate assay format.

Several non-radioactive alternatives to thymidine incorporation assays
have been developed. These include enzyme-linked immunosorbent assay
(ELISA) detection of the incorporation of the nucleotide
bromo-deoxyuridine (BrdU) (Perros, P. and Weightman,
D.R. (1991) "Measurement of cell proliferation by enzyme-linked
immunosorbent assay (ELISA) using a monoclonal antibody to
bromodeoxyuridine" Cell Prolif. 24, 517-523; Wemme, H. et al. (1992)
"Measurement of lymphocyte proliferation: critical analysis of radioactive and
photometric methods" Immunobiology 185, 78-89) into replicating DNA, and
staining of proliferation-specific antigens such as Ki-67 (Frahm, S.O. et al.
(1998) "Improved ELISA proliferation assay (EPA) for the detection of in vitro
cell proliferation by a new Ki-67-antigen directed monoclonal antibody (Ki-S3)"
J. Immunol. Methods 211, 43-50).

Colourimetric methods based on substrate conversion (Mosmann, T.
(1983) "Rapid colourimetric assay for cellular growth and survival: application
to proliferation and cytotoxicity assays" J. Immunol. Methods 65, 55-63;
Roehm, N.W. et al. (1991) "An improved colourimetric assay for cell
proliferation and viability utilizing the tetrazolium salt XTT" J. Immunol.
Methods 142, 257-265) by mitochondrial and other cellular enzymes are also
used to measure cell growth. Although these assays are often referred to as
cell-proliferation assays, strictly speaking they are cell-mass assays. Unlike
measuring thymidine or BrdU incorporation, these assays do not provide any
inherent measure of cell cycle progression, and give only a measure of cell mass,
i.e. increase in cell number, relative to another population.

Other methods for measuring cell proliferation (i.e. increasing cell
numbers) have been reported based on measuring electrical impedance
(Upadhyay, P. and Bhaskar, S. (2000) "Real time monitoring of lymphocyte
proliferation by an impedance method" J. Immunol. Methods 244, 133-137),
dissolved oxygen (Wodnicka, M. et al. (2000) "Novel fluorescent technology
platform for high throughput cytotoxicity and proliferation assays" J. Biomol.
Screen. 5, 141-152) and others. However, as for the colourimetric assays
discussed above, these do not directly report cell cycle parameters and have not
been widely adopted.

All of the above methods provide data on the overall proliferation within 
a cell population under examination, but do not identify the status of individual 
cells. Adaptation of these assays to imaging, for example by
micro-autoradiography of [3H]- or [14C]-thymidine incorporation (Dormer, P. (1981)
"Quantitative carbon-14 autoradiography at the cellular level: principles and
application for cell kinetic studies" Histochem. J. 13, 161-171) or by
immunochemical or immunofluorescence detection of BrdU (Dolbeare, F.
(1995) "Bromodeoxyuridine: a diagnostic tool in biology and medicine, Part I:
historical perspectives, histochemical methods and cell kinetics" Histochem. J.
27, 339-369) permits identification of cells that have traversed S phase, but does
not yield information on the cell cycle position of other cells under analysis. 

To determine the cell cycle status of all cells in a population it is a 
prerequisite that the analytical technique can resolve at least to the level of a 
single cell. Of the two qualifying techniques available, flow cytometry and
microscopy, flow cytometry has become firmly established as the standard
method for analysing cell cycle distribution. 

The DNA content of cell nuclei varies through the cell cycle in a
predictable fashion - cells in G2 or M have twice the DNA content of cells in
G1, and cells undergoing DNA synthesis in S phase have an intermediate
amount of DNA. Consequently, staining of cellular DNA with propidium iodide
(Nairn, R.C. and Rolland, J.M. (1980) "Fluorescent probes to detect lymphocyte
activation" Clin. Exp. Immunol. 39, 1-13) or other fluorescent dyes (Smith, P.J.
et al. (2000) "Characteristics of a novel deep red/infrared fluorescent
cell-permeant DNA probe, DRAQ5, in intact human cells analyzed by flow
cytometry, confocal and multiphoton microscopy" Cytometry 40, 280-291) that
are compatible with live cells, followed by flow cytometry permits measurement
of the relative proportion of cells in G1, S and G2/M phases. However, analysis
by propidium iodide staining and flow cytometry is necessarily destructive and
hence requires multiple samples to study cell cycle progression, which can
become rate limiting where many hundreds of samples are to be analysed. In
addition, flow cytometry does not yield fine resolution of cell cycle position in
G2/M as the DNA content is the same in all cells.
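
The relationship between DNA content and cell cycle phase described above can be illustrated with a minimal sketch. It assumes that DNA-stain intensities have already been normalised so that the G1 peak equals 1.0; the function names and the tolerance value are hypothetical and are not taken from the cited methods.

```python
from collections import Counter


def phase_from_dna_content(relative_dna, tolerance=0.15):
    """Assign a coarse phase from DNA content normalised to the G1 peak (G1 = 1.0).

    The tolerance is an illustrative assumption, not a value from the text.
    """
    if relative_dna <= 1.0 + tolerance:
        return "G1"
    if relative_dna >= 2.0 - tolerance:
        # G2 and M share the same DNA content, so they cannot be separated here.
        return "G2/M"
    return "S"


def phase_fractions(relative_dna_values):
    """Return the fraction of cells assigned to each coarse phase."""
    counts = Counter(phase_from_dna_content(v) for v in relative_dna_values)
    total = len(relative_dna_values)
    return {phase: counts[phase] / total for phase in ("G1", "S", "G2/M")}
```

The sketch also shows the limitation noted above: every cell with doubled DNA content falls into a single G2/M bin, whatever its actual position within G2 or mitosis.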

A combination of DNA staining with pulsed BrdU incorporation can be
used to resolve the cell cycle position further (Dolbeare, F. et al. (1983) "Flow
cytometric measurement of total DNA content and incorporated
bromodeoxyuridine" Proc. Natl. Acad. Sci. U.S.A. 80, 5573-5577).
Dual-parameter analysis of DNA staining and/or BrdU incorporation can also be used
with antibodies to cell-surface markers to profile cell cycle distribution in a
defined subpopulation of cells (Mehta, B.A. and Maino, V.C. (1997)
"Simultaneous detection of DNA synthesis and cytokine production in
staphylococcal enterotoxin B activated CD4+ T lymphocytes by flow cytometry"
J. Immunol. Methods 208, 49-59; see also Johannisson, A. et al. (1995)
"Activation markers and cell proliferation as indicators of toxicity: a flow
cytometric approach" Cell Biol. Toxicol. 11, 355-366; see also Penit, C. and
Vasseur, F. (1993) "Phenotype analysis of cycling and postcycling thymocytes:
evaluation of detection methods for BrdUrd and surface proteins" Cytometry 14,
757-763).

Although to date flow cytometry has remained the dominant method for 
analysing the cell cycle, many of the above techniques have also been applied to 
microscopic analyses (Gorczyca, W. et al. (1996) "Laser scanning cytometer 
(LSC) analysis of fraction of labeled mitoses (FLM)" Cell Prolif. 29, 539-547; 
Clatch, R.J. and Foreman, J.R. (1998) "Five-colour immunophenotyping plus
DNA content analysis by laser scanning cytometry" Cytometry 34, 36-38).

The techniques described above all provide information in various forms
from a single point in time (e.g. propidium iodide staining for DNA content) or
integrated over a period of time (e.g. thymidine or BrdU incorporation). One
further technique, cell-division tracking (Nordon, R.E. et al. (1999) "Analysis of
growth kinetics by division tracking" Immunol. Cell Biol. 77, 523-529; Lyons,
A.B. (1999) "Divided we stand: tracking cell proliferation with
carboxyfluorescein diacetate succinimidyl ester" Immunol. Cell Biol. 77,
509-515), allows the replicative history of a cell population to be analysed. In this
method cells are loaded with a fluorescent dye such as carboxy-fluorescein 
diacetate succinimidyl ester (CFSE), which is partitioned between daughter cells 
at each successive round of cell division with a twofold reduction in 
fluorescence. Subsequent analysis of cell fluorescence by flow cytometry 
reveals the number of cell divisions undergone by each cell in the population. 
This technique has also been used in multi-parameter analyses combined with 
BrdU and proliferation-marker staining (Hasbold, J. and Hodgkin, P.D. (2000) 
"Flow cytometric cell division tracking using nuclei" Cytometry 40, 230-237). 

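The twofold dilution of CFSE fluorescence at each division implies that the number of divisions can be estimated as the base-2 logarithm of the ratio of starting to measured fluorescence. The following is a minimal sketch of that arithmetic only; the function name is hypothetical, and real data would also need background subtraction and allowance for dye loss unrelated to division, which are not modelled here.

```python
import math


def divisions_from_cfse(initial_fluorescence, measured_fluorescence):
    """Estimate completed divisions, assuming fluorescence halves at each division."""
    if initial_fluorescence <= 0 or measured_fluorescence <= 0:
        raise ValueError("fluorescence values must be positive")
    ratio = initial_fluorescence / measured_fluorescence
    # Round to the nearest whole division; never report a negative count.
    return max(0, round(math.log2(ratio)))
```

For instance, a cell retaining one quarter of the starting fluorescence is estimated to have divided twice.
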
International patent application WO 01/11341 describes a method for the
automated measurement of the mitotic index of cells using fluorescence
imaging. The technique involves immunofluorescence which reports
specifically on mitotic cells by signals emitted from the cell nuclei, dependent 
upon the phosphorylation of histone H3. A mitotic index is determined by 
detecting the number of mitotic cells compared with the number of nuclei 
detected in a separate fluorescence channel. The technique involves simply 
counting cells having a signal above a given threshold, and is unsuited for the 
detection of cell cycle phases other than mitosis. Furthermore, the signal 
thresholds have to be predetermined, or entered by an operator. 
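
The threshold-counting approach described above can be sketched as follows. This is an illustration of the general idea only, not the method of the cited application: the function name is hypothetical, and each detected nucleus is assumed to carry one marker intensity value from the mitosis-specific channel.

```python
def mitotic_index(marker_intensities, threshold):
    """Fraction of detected nuclei whose mitotic-marker signal exceeds the threshold.

    marker_intensities: one value per nucleus found in the separate nuclear channel.
    threshold: predetermined or operator-entered cut-off, as the text describes.
    """
    if not marker_intensities:
        raise ValueError("no nuclei detected")
    mitotic = sum(1 for value in marker_intensities if value > threshold)
    return mitotic / len(marker_intensities)
```

Because the result depends entirely on the supplied threshold, the sketch makes the drawback noted above concrete: a different cut-off gives a different index for the same image.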

The application of GFP and imaging techniques to cell cycle analysis has 
enabled significant advances to be made in understanding the timing of the 
molecular events that control the cell cycle. Fusing GFP with key
cell-cycle-control proteins has provided significant insights into the molecular organisation
behind the cell cycle (see Raff, J.W. et al. (2002) "The roles of Fzy/Cdc20 and
Fzr/Cdh1 in regulating the destruction of cyclin B in space and time" J. Cell
Biol. 157, 1139-1149; Zeng, Y. et al. (2000) "Minimal requirements for the
nuclear localization of p27(Kip1), a cyclin-dependent kinase inhibitor"
Biochem. Biophys. Res. Commun. 274, 37-42; Huang, J. and Raff, J.W. (1999)
"The disappearance of cyclin B at the end of mitosis is regulated spatially in
Drosophila cells" EMBO J. 18, 2184-2195; Weingartner, M. et al. (2001)
"Dynamic recruitment of Cdc2 to specific microtubule structures during
mitosis" Plant Cell 13, 1929-1943; Arnaud, L. et al. (1998) "GFP tagging
reveals human Polo-like kinase 1 at the kinetochore/centromere region of
mitotic chromosomes" Chromosoma 107, 424-429) and other cellular
components (Kanda, T. et al. (1998) "Histone-GFP fusion protein enables
sensitive analysis of chromosome dynamics in living mammalian cells" Curr.
Biol. 8, 377-385; Reits, E.A. et al. (1997) "Dynamics of proteasome distribution
in living cells" EMBO J. 16, 6087-6094; Tatebe, H. et al. (2001) "Fission yeast
living mitosis represented by GFP-tagged gene products" Micron 32, 67-74).
However, although these specialised approaches provide valuable data on the
mechanisms and components involved, they are not generic methods for
monitoring the cell cycle.

Summary of the Invention 

In accordance with one aspect of the present invention, there is provided 
a method of classifying cells into subpopulations using cell classifying data, the
method comprising: receiving image data; analyzing said image data to identify
object areas in the image data; analyzing said image data, on the basis of said
identified object areas, to determine, for at least one selected first cell, one or
more measurements; deriving a first parameter set for the first cell, the first
parameter set comprising at least one of said one or more measurements;
classifying a first set of cells, the process of classifying the first set of cells
including classifying the first cell into a subpopulation and storing first
identifying data indicating the subpopulation into which the first cell has been
classified; deriving cell classifying data for use in classifying a second set of cells
into subpopulations from the first parameter set and the first identifying data;
and classifying a second set of cells into subpopulations on the basis of one or 
more measurements taken for cells in the second set of cells, by use of the cell 
classifying data. 

The present invention provides a cell classification method that 'learns' 
from previous classifications. The process of learning to classify by the analysis
of data relating to previously-classified examples may be by means of a process
termed 'supervised learning', and as such the present invention provides a
robust method of supervised learning for the purposes of cellular analysis. Cell
classifying data is derived from a parameter set and associated identifying data. 

The parameter set includes at least one measurement relating to a cell. 
The object area may correspond to an entire cell, an area corresponding to
or within the nucleus, an area corresponding to or within the cytoplasm, or other 
object areas corresponding to or within subcellular components. Examples of 
measurements include: 

an average signal intensity within an identified object area; 

a fraction of pixels that deviate more than a given amount from an 
average signal intensity within an identified object area; 

a number of pixels with a signal intensity below a given threshold within 
an identified object area; 

a ratio between major and minor axes of an elliptical outline 
corresponding to an identified object area; 

a maximum width of an identified object area; 
an average width of an identified object area; 
signal texture within an identified object area; 
margination in an identified object area. 
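
Several of the listed measurements can be computed directly from the pixels of one identified object area. The sketch below is illustrative only: the function names are hypothetical, the object area is assumed to be supplied as a list of pixel intensities and a list of (x, y) pixel coordinates, and the axis ratio is taken from the ellipse with the same second moments as the pixel coordinates, which is one common choice rather than a method stated in the text.

```python
import math
from statistics import mean


def intensity_measurements(pixels, deviation, threshold):
    """Intensity-based measurements for the pixels of one identified object area."""
    avg = mean(pixels)
    deviating = sum(1 for p in pixels if abs(p - avg) > deviation)
    return {
        "average_intensity": avg,
        "fraction_deviating": deviating / len(pixels),
        "count_below_threshold": sum(1 for p in pixels if p < threshold),
    }


def axis_ratio(coords):
    """Major/minor axis ratio of the ellipse matching the area's second moments."""
    n = len(coords)
    mx = mean(x for x, _ in coords)
    my = mean(y for _, y in coords)
    sxx = sum((x - mx) ** 2 for x, _ in coords) / n
    syy = sum((y - my) ** 2 for _, y in coords) / n
    sxy = sum((x - mx) * (y - my) for x, y in coords) / n
    # Eigenvalues of the 2x2 covariance matrix give the squared axis lengths.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(max(0.0, half_trace ** 2 - (sxx * syy - sxy ** 2)))
    major, minor = half_trace + root, half_trace - root
    if minor <= 0:
        return math.inf  # degenerate area, e.g. a single row of pixels
    return math.sqrt(major / minor)
```

The values returned for one object area could together form the parameter set for the corresponding cell.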

In an embodiment, the measurement(s) may be calculated automatically 
using a set of image analysis routines. The measurements for each object area 
may then be stored in memory in association with identification data, to build up 
a database of classifying data which can later be applied with minimal user 
intervention to further sets of cells. High-throughput automated cell 
classification can thereby be achieved. 

A method according to the present invention may derive cell classifying 
data based on parameter sets including any measurement determined from the 
image data. In this way, cell classifying data may be derived from image data 
that includes but is not limited to the luminescence of reporter genes. A 
parameter set may be derived from the one or more measurements taken for the 
second set of cells. 

A method according to the present invention may be used to classify
cells into subpopulations according to cell morphology. For example, the
identifying data may be neurite formation/outgrowth or may classify the cell
according to other criteria.

A method according to the present invention may be used to classify 
cells into subpopulations according to receptor binding. For example, the 
identifying data may be granule/vesicle formation or colour change (e.g. in the 
presence of specific dyes such as CypHer™5 from Amersham Biosciences) or 
may classify the cell according to other criteria. 

A method according to the present invention may be used to classify 
cells into subpopulations according to cell cycle phase. The identifying data 
may be a cell cycle phase classification (e.g. 'prophase', 'metaphase', 
'anaphase', 'telophase', 'G1', 'S', 'G2') or may classify the cell according to
other criteria. 

A method according to the present invention may take any measurement 
of the second set of cells from the image data without user intervention and as 
such will not require an operator to input any threshold or specify any 
measurement value relating to the second set of cells. The cell classifying data 
derived from the identifying data and the parameter set will therefore be derived 
from objective and accurate measurement data, facilitating accurate 
classification of further sets of cells. 

Use of the cell classifying data to classify a second set of cells may 
include comparing the measurements for cells in the second set with the cell 
cycle classifying data derived from classification of the first set of cells. For 
example, if a cell in the first set is classified as being in prophase, and the 
parameter set for that cell includes a measurement of reporter luminescence 
having a value x, a cell in the second set which is determined to also have a 
reporter luminescence value sufficiently similar to x may be classified as also 
being in prophase. In this way, the derivation of the cell classifying data and the 
application of the data to a second set of cells allows automated classification of
the second set of cells.

The use of the cell classifying data to classify a second set of cells may
include calculating a statistical likelihood for each cell in the second set of being a
member of a classified group. For example, the value of a measurement taken
for a cell in the second set may be compared with the analogous measurement in
the parameter sets of classified cells and, if no exact match of the value is found, 
the nearest match is calculated, and the cell in the second set classified 
according to the nearest match. Several measurements may be taken and 
weighted in statistical importance when compared with the parameter sets of 
classified cells. 
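
The nearest-match comparison with weighted measurements can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the weights are supplied by the caller, and a weighted Euclidean distance is used as the measure of similarity, which the text does not prescribe.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two parameter sets of equal length."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def classify_nearest(measurements, classified_cells, weights):
    """Return the subpopulation of the closest previously classified cell.

    classified_cells: list of (parameter_set, subpopulation) pairs from the first set.
    """
    _, label = min(
        classified_cells,
        key=lambda item: weighted_distance(measurements, item[0], weights),
    )
    return label
```

Changing the weights changes which measurement dominates the comparison, so the same cell may be assigned to a different subpopulation under a different weighting.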

In embodiments where n measurements are taken from the cell image
data, the parameter set may be represented as a feature vector in an
n-dimensional feature space. The representation of the parameter set as a feature
vector in a feature space allows a number of classification techniques to be 
employed, and is described in more detail below. 

Further features and advantages of the invention will become apparent 
from the following description of preferred embodiments of the invention, given 
by way of example only, which is made with reference to the accompanying 
drawings. 

Brief Description of the Drawings 

Figure 1 is a flow diagram showing a method of cell classification
according to an embodiment of the invention. 

Figure 2 is a schematic view of a first embodiment of a line-scan
confocal microscope used to image samples according to the present invention. 

Figures 3A and 3B are, respectively, a top view and a side view of the 
ray path of a multicolour embodiment of the present invention, without a 
scanning mirror. Figure 3C is a top view of the ray path of a single beam 
autofocus. 

Figures 4A and 4B are, respectively, a top view and a side view of the 
ray path of the multicolour embodiment of the present invention with the 
scanning mirror. Figure 4C is a top view of the ray path of the single beam
autofocus. 

Figure 5 is a side view of the two beam autofocus system.

Figures 6A, 6B and 6C illustrate a rectangular CCD camera and readout
register.

Figure 7 is a schematic illustration showing data processing components
in an imaging and data processing system arranged in accordance with an
embodiment of the invention.

Figure 8 is a schematic diagram illustrating cell cycle position nucleic 
acid reporter constructs used in an embodiment of the present invention. 

Figure 9 shows a DNA construct for determining the G2/M phase of the 
cell cycle. 

Figure 10 is a schematic diagram illustrating cyclin B1 regulation during
cell cycle progression. The cell cycle proceeds in the direction of the arrow with
cyclin B1 expression driven by a cell cycle phase-specific promoter which
initiates expression at the end of the S phase and peaks during G2 (A). At the
start of mitosis (B) cyclin B1 translocates from the cytoplasm to the nucleus and
from metaphase onwards (C) the protein is specifically degraded. 

Figure 11 is a schematic illustration showing typical intensity and 
distribution of signals in a cell including a fluorescent reporter in accordance 
with an embodiment of the invention, in each of the G0/G1/S, G2, prophase and 
mitosis (M) cell cycle phases. 

Figure 12 is a schematic illustration showing typical intensity and 
distribution of signals in a cell including a fluorescent reporter in accordance 
with an embodiment of the invention, in each of the metaphase, anaphase, 
telophase, and cytokinesis cell cycle phases. 

Figure 13 is a representation of a parameter set as a feature vector in a 3 
dimensional feature space. 

Figure 14 is a representation of a feature vector in a 2 dimensional space,
with clusters of pre-classified feature vectors. 

Detailed Description of the Invention 

The present invention is useful for identifying pharmacological agents
for the treatment of disease. It provides a potentially automated, high
throughput method for conducting a wide variety of biological assays where one
or more markers, including luminescent markers, are employed to measure a
biological response. Several markers may be used in conjunction to derive a
variety of measurements, and the measurements may be determined
automatically to ensure accuracy. Such assays can be conducted on chemical 
compounds or any molecule of biological interest, including but not limited to 
drug candidates, such as those found in combinatorial libraries, allowing high 
throughput screening of chemical compounds of biological interest. 

The techniques of the present invention may be used in assays in which 
data are acquired on individual cells, on a cellular or sub-cellular level, 
sufficiently rapidly so as to permit the acquisition of such data on a sufficient 
number of cells to constitute a statistically meaningful sample of the cell 
population. 

These assays may make use of any known fluorophore or fluorescent 
label including but not limited to fluorescein, rhodamine, Texas Red, Amersham 
Corp. stains Cy3, Cy5, Cy5.5 and Cy7, Hoechst's nuclear stains and Coumarin 
stains. (See Haugland, R.P., Handbook of Fluorescent Probes and Research 
Chemicals 6th Ed., 1996, Molecular Probes, Inc., Eugene, Oregon.)

Figure 1 is a flow diagram illustrating an embodiment of the invention. 
Image data from a first cell population is received by an imaging device which 
incorporates a data processing system. The image data is analysed by the data 
processing system in step 1 to derive object areas which correspond to cells and 
the areas within cells, specifically the nucleus and cytoplasm. 

In step 2, measurements are taken from the image data. These 
measurements may relate to the intensity, morphology and dimensions of the 
cells represented in the image data. The measurements are stored in the memory 
of the data processing system. 

At step 3, the measurements for each cell identified from the image data 
are grouped together in a parameter set, stored in the memory of the data 
processing system. Each cell identified in the image data is assigned its own 
parameter set. 

At step 4, a user classifies cells represented in the image data. This may
be done via a known graphical user interface attached to the data processing
system. The data resulting from the user classifying cells is received by the data
processing system at step 5. The identifying data for each cell is saved in
association with that cell's parameter set in the memory of the data processing
system in step 6, to derive classifying data at step 7. The identifying data allots 
each cell to a subpopulation based on features in the cell image data. 

In step 8, the classifying data is applied to a second set of cells. The 
second set of cells is analysed by the imaging device, divided into object areas, 
and measurements are taken in a manner similar to steps 1 and 2. The 
measurements are then analysed and the second set of cells are divided into 
subpopulations, on the basis of the measurements taken for cells in the second 
set of cells, by use of the cell classifying data. 
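
The flow from user-classified cells to automated classification (steps 6 to 8) can be sketched end to end. The sketch substitutes a simple nearest-mean rule for whatever classifier an embodiment actually uses: the classifying data is assumed here to be the mean parameter set of each subpopulation, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def derive_classifying_data(parameter_sets, identifying_data):
    """Steps 6-7: combine parameter sets and user labels into one mean vector per subpopulation."""
    grouped = defaultdict(list)
    for params, label in zip(parameter_sets, identifying_data):
        grouped[label].append(params)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify_second_set(parameter_sets, classifying_data):
    """Step 8: assign each cell to the subpopulation with the nearest mean vector."""
    return [
        min(classifying_data, key=lambda label: math.dist(params, classifying_data[label]))
        for params in parameter_sets
    ]
```

No threshold or measurement value is entered for the second set; only the parameter sets measured from its image data are passed in.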

A detailed description of the steps shown in Figure 1 follows.

Figure 2 shows a first embodiment of the present invention, where the
imaging device used is a microscope. The microscope comprises a source 100
or 110 of electromagnetic radiation, for example in the optical range, 350-750 nm,
a cylindrical lens 120, a first slit mask 130, a first relay lens 140, a
dichroic mirror 150, an objective lens 170, a microtiter plate 180 containing a
two-dimensional array of sample wells 182, a tube lens 190, a filter 200, a 
second slit mask 210 and a detector 220. These elements are arranged along 
optical axis OA with slit apertures 132, 212 in masks 130, 210 extending 
perpendicular to the plane of Figure 2. The focal lengths of lenses 140, 170 and 
190 and the spacings between these lenses as well as the spacings between mask 
130 and lens 140, between objective lens 170 and microtiter plate 180 and 
between lens 190 and mask 210 are such as to provide a confocal microscope. 
In this embodiment, electromagnetic radiation from a lamp 100 or a laser 110 is
focused to a line using a cylindrical lens 120. The shape of the line is optimized 
by a first slit mask 130. The slit mask 130 is depicted in an image plane of the 
optical system that is in a plane conjugate to the object plane. The illumination 
stripe formed by the aperture 132 in the slit mask 130 is relayed by lens 140, 
dichroic mirror 150 and objective lens 170 onto a microtiter plate 180 which 
contains a two-dimensional array of sample wells 182. For convenience of 
illustration, the optical elements of Figure 2 are depicted in cross-section and the 
well plate in perspective. The projection of the line of illumination onto well
plate 180 is depicted by line 184 and is also understood to be perpendicular to
the plane of Figure 2. As indicated by arrows A and B, well plate 180 may be
moved in two dimensions (X, Y) parallel to the dimensions of the array by 
means not shown. 

In an alternative embodiment, the slit mask 130 resides in a Fourier
plane of the optical system that is in a plane conjugate to the objective back
focal plane (BFP) 160. In this case the aperture 132 lies in the plane of the
figure, the lens 140 relays the illumination stripe formed by the aperture 132
onto the back focal plane 160 of the objective 170 which transforms it into a line
184 in the object plane perpendicular to the plane of Figure 2.

In an additional alternative embodiment the slit mask 130 is removed
entirely. According to this embodiment, the illumination source is the laser 110,
the light from which is focused into the back focal plane 160 of the objective
170. This can be accomplished by the combination of the cylindrical lens 120
and the spherical lens 140 as shown in Figure 2, or the illumination can be
focused directly into the plane 160 by the cylindrical lens 120.

An image of the sample area, for example a sample in a sample well 182,
is obtained by projecting the line of illumination onto a plane within the sample,
imaging the fluorescence emission therefrom onto a detector 220 and moving
the plate 180 in a direction perpendicular to the line of illumination,
synchronously with the reading of the detector 220. In the embodiment depicted
in Figure 2, the fluorescence emission is collected by the objective lens 170,
projected through the dichroic beamsplitter 150, and imaged by lens 190
through filters 200 and a second slit mask 210 onto a detector 220, such as is
appropriate to a confocal imaging system having an infinity-corrected objective
lens 170. The dichroic beamsplitter 150 and filter 200 preferentially block light
at the illumination wavelength. The detector 220 illustratively is a camera and 
may be either one dimensional or two dimensional. If a one dimensional 
detector is used, slit mask 210 is not needed. The illumination, detection and 
translation procedures are continued until the prescribed area has been imaged. 

Mechanical motion is simplified if the sample is translated at a continuous rate.
Continuous motion is most useful if the camera read-time is small compared to
the exposure-time. In a preferred embodiment, the camera is read continuously.
The displacement d of the sample during the combined exposure-time and
read-time may be greater than or less than the width of the illumination line W,
exemplarily 0.5W < d < 5W. All of the wells of a multiwell plate can be imaged
in a similar manner.
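
As a rough illustration of the displacement condition above, the following Python sketch computes the stage speed implied by a chosen displacement d per exposure-plus-read cycle and checks d against the exemplary window 0.5W < d < 5W. The function names and the numerical values (a 1 µm line width, 0.9 ms exposure, 0.1 ms read) are illustrative assumptions and are not taken from the specification.

```python
def stage_speed_um_per_s(d_um, exposure_s, read_s):
    """Speed at which the sample moves d_um during one exposure plus read."""
    return d_um / (exposure_s + read_s)


def displacement_in_window(d_um, line_width_um, lo=0.5, hi=5.0):
    """True if lo*W < d < hi*W, the exemplary window for the displacement d."""
    return lo * line_width_um < d_um < hi * line_width_um


if __name__ == "__main__":
    W = 1.0  # assumed illumination line width, micrometres
    d = 1.0  # assumed displacement per cycle, micrometres
    print(displacement_in_window(d, W))
    print(stage_speed_um_per_s(d, exposure_s=0.9e-3, read_s=0.1e-3))
```

With these assumed values the sample moves one line width per millisecond cycle, i.e. roughly 1 mm per second.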

Alternatively, the microscope can be configured to focus a line of
illumination across a number of adjacent wells, limited primarily by the
field-of-view of the optical system. Finally, more than one microscope can be used
simultaneously.

The size and shape of the illumination stripe 184 is determined by the
width and length of the Fourier transform stripe in the objective lens back focal
plane 160. For example, the length of the line 184 is determined by the width of
the line in 160 and conversely the width in 184 is determined by the length in
160. For diffraction-limited performance, the length of the illumination stripe at
160 is chosen to overfill the objective back aperture. It will be evident to one
skilled in the art that the size and shape of the illumination stripe 184 can be
controlled by the combination of the focal length of the cylindrical lens 120 and
the beam size at 120, that is by the effective numerical aperture in each
dimension, within the restrictions imposed by aberrations in the objective, and
the objective field of view.

The dimensions of the line of illumination 184 are chosen to optimize
the signal to noise ratio. Consequently, they are sample dependent. Depending
on the assay, the resolution may be varied between diffraction-limited, i.e., less
than 0.5 µm, and approximately 5 µm. The beam length is preferably
determined by the objective field of view, exemplarily between 0.5 and 1.5 mm.
A Nikon ELWD, 0.6 NA, 10X objective, for example, has a field of view of
approximately 0.75 mm. The diffraction-limited resolution for 633 nm radiation
with this objective is approximately 0.6 µm or approximately 1100 resolution
elements.
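
The figures quoted for the example objective can be checked with a short calculation. The sketch below assumes the Rayleigh criterion (0.61 λ / NA) as the definition of diffraction-limited resolution; the specification does not say which criterion was used, so the result is only expected to agree approximately with the quoted values.

```python
def rayleigh_resolution_um(wavelength_um, numerical_aperture):
    """Diffraction-limited lateral resolution under the Rayleigh criterion."""
    return 0.61 * wavelength_um / numerical_aperture


def resolution_elements(field_of_view_um, resolution_um):
    """Number of resolution elements that fit along the field of view."""
    return field_of_view_um / resolution_um


if __name__ == "__main__":
    res = rayleigh_resolution_um(0.633, 0.6)   # 633 nm light, 0.6 NA
    n = resolution_elements(750.0, res)        # 0.75 mm field of view
    print(f"resolution ~ {res:.2f} um, ~ {n:.0f} resolution elements")
```

Under this assumption the resolution comes out slightly above 0.6 µm and the element count between 1100 and 1200, in line with the approximate values in the text.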

The effective depth resolution is determined principally by the width of
aperture 212 in slit mask 210 or the width of the one dimensional detector and
the image magnification created by the combination of the objective lens 170
and lens 190. The best depth resolution of a confocal microscope approaches 1
µm. In the present application, a depth resolution of 5-10 µm may be sufficient
or even advantageous.

For example, when the sample of interest, such as a live cell, contains
insufficient fluorophores in a diffraction-limited volume to permit an adequate
signal-to-noise image in a sufficiently brief image-acquisition time, it is
advantageous to illuminate and collect the emission from a larger than
diffraction-limited volume. A similar situation prevails in the case of video-rate
kinetics studies of transient events such as ion-channel openings. Practically,
this is accomplished by underfilling the back aperture of the objective lens,
which is equivalent to increasing the diameter of the illumination aperture. The
effective numerical aperture ("NA") of the illumination is less than the NA of
the objective. The fluorescence emission is, however, collected with the full NA
of the objective lens. The width of aperture 212 must be increased so as to
detect emission from the larger illumination volume. At an aperture width a few
times larger than the diffraction limit, geometrical optics provides an adequate
approximation for the size of the detection-volume element:
Lateral Width: $a_d = d_d / M$,
Axial Width: $z_d = \sqrt{2}\, a_d / \tan\alpha$;
where M is the magnification, $d_d$ is the width of aperture 212 and $\alpha$ is the
half-angle subtended by the objective 170. It is an important part of the present
invention that the illumination aperture 132 or its equivalent in the embodiment
having no aperture and the detection aperture 212 be independently controllable.
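
The geometrical-optics estimate of the detection volume can be evaluated numerically. The sketch below assumes that the axial-width expression reads z_d = sqrt(2) * a_d / tan(alpha) and that the half-angle follows from the numerical aperture as alpha = arcsin(NA / n). The 100 µm slit width, 20X magnification and 0.6 NA in air are illustrative values only.

```python
import math


def half_angle_rad(numerical_aperture, refractive_index=1.0):
    """Half-angle subtended by the objective, from NA = n * sin(alpha)."""
    return math.asin(numerical_aperture / refractive_index)


def detection_volume_um(aperture_width_um, magnification, alpha_rad):
    """Return (lateral width a_d, axial width z_d) of the detection element.

    a_d = d_d / M
    z_d = sqrt(2) * a_d / tan(alpha)   (assumed reading of the formula)
    """
    a_d = aperture_width_um / magnification
    z_d = math.sqrt(2.0) * a_d / math.tan(alpha_rad)
    return a_d, z_d


if __name__ == "__main__":
    alpha = half_angle_rad(0.6)
    a_d, z_d = detection_volume_um(100.0, 20.0, alpha)
    print(f"a_d = {a_d:.2f} um, z_d = {z_d:.2f} um")
```

For these assumed values the lateral width is 5 µm and the axial width a little under 10 µm, which falls in the 5-10 µm depth-resolution range mentioned earlier.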

Multi-Wavelength Configuration

An embodiment enabling multi-wavelength fluorescence imaging is
preferred for certain types of assays. In this way, image data can be generated
for the same area being imaged in each of a plurality of different colour channels
simultaneously.

The number of independent wavelengths or colours will depend on the
specific assay being performed. In one embodiment three illumination
wavelengths are used. Figs. 3A and 3B depict the ray paths in a three-colour
line-scan confocal imaging system, from a top view and a side view
respectively. In general, the system comprises several sources Sn of
electromagnetic radiation, collimating lenses Ln, and mirrors Mn for producing a
collimated beam that is focused by cylindrical lens CL into an elongated beam
at first spatial filter SF1, a confocal microscope between first spatial filter SF1
and second spatial filter SF2, and an imaging lens IL, beamsplitters DM1 and
DM2 and detectors Dn for separating and detecting the different wavelength
components of fluorescent radiation from the sample. Spatial filters SF1
and SF2 preferably are slit masks.

In particular, Figure 3A depicts sources S1, S2 and S3, for colours λ1, λ2
and λ3, and lenses L1, L2 and L3 that collimate the light from the respective
sources. Lenses L1, L2 and L3 preferably are adjusted to compensate for any
chromaticity of the other lenses in the system. Mirrors M1, M2 and M3 are used
to combine the illumination colours from sources Sn. The mirrors M2 and M1
are partially transmitting, partially reflecting and preferentially dichroic. M2, for
example, should preferentially transmit λ3 and preferentially reflect λ2. It is thus
preferential that λ3 be greater than λ2.

Operation of the microscope in a confocal mode requires that the
combined excitation beams from sources Sn be focused to a "line", or a highly
eccentric ellipse, in the object plane OP. As discussed in connection with Figure 2
above, a variety of configurations may be used to accomplish this. In the
embodiment depicted in Figure 3A, the combined illumination beams are
focused by cylindrical lens CL into an elongated ellipse that is coincident with
the slit in the spatial filter SF1. As drawn in Figs. 3A and 3B, the slit mask SF1
resides in an image plane of the system, aligned perpendicular to the
propagation of the illumination light and with its long axis in the plane of the
page of Figure 3A. The lenses TL and OL relay the illumination line from the
plane containing SF1 to the object plane OP. A turning mirror, TM, is for
convenience. In another embodiment, DM3 is between TL and OL and CL
focuses the illumination light directly into the BFP. Other embodiments will be
evident to one skilled in the art.

Referring to Figure 3B, the light emitted by the sample and collected by
the objective lens, OL, is imaged by the tube lens, TL, onto the spatial filter,
SF2. SF2 is preferentially a slit aligned so as to extend perpendicular to the plane
of the page. Thus, the light passed by filter SF2 is substantially a line of
illumination. SF2 may be placed in the primary image plane or any plane
conjugate thereto. DM3 is partially reflecting, partially transmitting and
preferably "multichroic". Multi-wavelength "dichroic" mirrors, or "multichroic"
mirrors can be obtained that preferentially reflect certain wavelength bands and
preferentially transmit others.

Here, δλ1 will be defined to be the fluorescence emission excited by λ1.
This will, in general, be a distribution of wavelengths somewhat longer than λ1.
δλ2 and δλ3 are defined analogously. DM3 preferentially reflects λn and
preferentially transmits δλn, n=1,2,3. The light transmitted by SF2 is imaged
onto the detection devices, which reside in planes conjugate to the primary
image plane. In Figure 3A, an image of the spatial filter SF2 is created by lens
IL on all three detectors, Dn. This embodiment is preferred in applications
requiring near-perfect registry between the images generated by the respective
detectors. In another embodiment, individual lenses ILn are associated with the
detection devices, the lens pairs IL and ILn serving to relay the image of the
spatial filter SF2 onto the respective detectors Dn. The light is split among the
detectors by mirrors DM1 and DM2. The mirrors are partially transmitting,
partially reflecting, and preferentially dichroic. DM1 preferentially reflects δλ1
and preferentially transmits δλ2 and δλ3. The blocking filter, BF1, preferentially
transmits δλ1, effectively blocking all other wavelengths present. DM2
preferentially reflects δλ2 and preferentially transmits δλ3. The blocking filters
BF2 and BF3 preferentially transmit δλ2 and δλ3 respectively, effectively
blocking all other wavelengths present.
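
The way the emission bands are separated by the two detection-side mirrors can be summarised as a small routing table. The Python sketch below is only a schematic model: the bands are numbered 1 to 3 for the three emission distributions, and the pairing of each mirror with a particular detector (the light reflected by DM1 reaching D1, that reflected by DM2 reaching D2, and the remainder reaching D3) is an assumption made for illustration.

```python
# Each dichroic is described by the emission band it preferentially reflects
# and the detector assumed to sit in the reflected path.
DICHROICS = [("DM1", 1, "D1"), ("DM2", 2, "D2")]
FINAL_DETECTOR = "D3"  # assumed to receive whatever both mirrors transmit


def route_emission(band):
    """Return (detector, path) for emission band 1, 2 or 3."""
    if band not in (1, 2, 3):
        raise ValueError("band must be 1, 2 or 3")
    path = []
    for name, reflected_band, detector in DICHROICS:
        if band == reflected_band:
            path.append(name + " reflects")
            return detector, path
        path.append(name + " transmits")
    return FINAL_DETECTOR, path


if __name__ == "__main__":
    for band in (1, 2, 3):
        print(band, route_emission(band))
```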

Scanning Mirror Configuration

In some embodiments of this invention, rapid data acquisition is 
provided by framing images at video rates. Video-rate imaging allows up to 30 
or even 60 frames per second. In the present use, it is intended to connote frame 
rates with an order-of-magnitude of 30 Hz. In a preferred embodiment,
video-rate imaging is achieved by illuminating along one dimension of the sample
plane and scanning the illumination beam in the direction perpendicular thereto
so as to effect a relative translation of the illumination and sample. The
scanning stage is generally massive and so cannot be moved sufficiently rapidly.

Figures 4A, 4B and 4C depict an embodiment of the invention utilizing a
scanning mirror, SM. The mirror is advantageously placed in a plane conjugate
to the objective back focal plane (BFP): a rotation in the BFP (or a plane
conjugate thereto) effects a translation in the object plane (OP) and its conjugate
planes. The full scan range of SM need only be a few degrees for typical values
of the focal lengths of the lenses RL1 and RL2. As shown in Figures 4A, 4B and
4C, this lens pair images the BFP onto the SM at a magnification of one, but a
variety of magnifications can be advantageously used. The limiting factors to
the image acquisition rate are the camera read-rate and the signal strength. In
the imaging mode described above, data can be acquired continuously at the
camera read-rate, exemplarily 1 MHz. With a scanning mirror, it is preferable to
acquire data uni-directionally. The idealized scanning motion allowing one to
acquire data continuously is the sawtooth. In practice, the combination of
turn-around and return scan times will constitute ~1/3-2/3 of the scan period.
Assuming 50% dead-time, a mirror oscillation frequency of 50 Hz and a pixel
acquisition rate of 1 MHz, ~10,000 pixels would be acquired per frame at 50
frames per second, which is sufficient to define and track individual objects,
such as cells, from frame to frame. 10^4 pixels per image is, however, 10^2-times
fewer than was generally considered above. Depending on the application, it is
advantageous to acquire relatively smaller images at high resolution, e.g. 50-µm
x 50-µm at 0.5-µm x 0.5-µm pixelation, or relatively larger images at lower
resolution, e.g. 200-µm x 200-µm at 2-µm pixelation.
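
The pixel budget quoted above follows directly from the read-rate, the mirror frequency and the dead-time. The short Python sketch below reproduces that arithmetic and confirms that both example image formats fit the same budget; the function names are illustrative.

```python
def pixels_per_frame(pixel_rate_hz, frame_rate_hz, dead_fraction):
    """Pixels acquired in the live part of each scan period."""
    return pixel_rate_hz * (1.0 - dead_fraction) / frame_rate_hz


def image_pixels(field_um, pixel_um):
    """Pixel count of a square field sampled at a given pixelation."""
    side = field_um / pixel_um
    return side * side


if __name__ == "__main__":
    budget = pixels_per_frame(1e6, 50.0, 0.5)  # 1 MHz, 50 Hz, 50% dead-time
    print(budget)
    print(image_pixels(50.0, 0.5))    # small field, high resolution
    print(image_pixels(200.0, 2.0))   # larger field, lower resolution
```

Both formats correspond to 100 x 100 pixels, matching the budget of about 10,000 pixels per frame.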

Autofocus 

In preferred embodiments of the present invention, the sample lies in the 
object plane of an imaging system. Accordingly, an autofocus mechanism is 
used that maintains the portion of the sample in the field-of-view of the imaging 
system within the object plane of that system. The precision of planarity is 
determined by the depth-of-field of the system. In a preferred embodiment, the
depth-of-field is approximately 10 µm and the field-of-view is approximately 1
mm².

The autofocus system operates with negligible delay, that is, the 
response time is short relative to the image acquisition-time, exemplarily 0.01- 
0.1 s. In addition, the autofocus light source is independent of the illumination 
light sources and the sample properties. Among other advantages, this 
configuration permits the position of the sample carrier along the optical axis of 
the imaging system to be determined independent of the position of the object 
plane. 

Embodiments of single-beam autofocus are shown in Figure 4C, where a
separate light source, S4 of wavelength λ4, and detector D4 are shown. The
wavelength λ4 is necessarily distinct from the sample fluorescence, and
preferentially a wavelength that cannot excite appreciable fluorescence in the
sample. Thus, λ4 is preferentially in the near infrared, exemplarily 800-1000
nm. The partially transmitting, partially reflecting mirror, DM4, is preferentially
dichroic, reflecting λ4 and transmitting λn and δλn, n=1,2,3. Optically-based
autofocus mechanisms suitable for the present application are known. For
example, an astigmatic-lens-based system for the generation of a position error
signal suitable for servo control is disclosed in Applied Optics 23 565-570
(1984). A focus error detection system utilizing a "skew beam" is disclosed in
SPIE 200 73-78 (1979). The latter approach is readily implemented according
to Figs. 3C and 3C, where D4 is a split detector.

For use with a microtiter plate having a sample residing on the well 
bottom, the servo loop must, however, be broken to move between wells. This 
can result in substantial time delays because of the need to refocus each time the 
illumination is moved to another well. 

Continuous closed-loop control of the relative position of the sample
plane and the object plane is provided in a preferred embodiment of the present
invention, depicted in Figure 5. This system utilizes two independent beams of
electromagnetic radiation. One, originating from S5, is focused on the
continuous surface, exemplarily the bottom of a microtiter plate. The other,
originating from S4, is focused on the discontinuous surface, exemplarily the
well bottom of a microtiter plate. In one embodiment, the beams originating
from S4 and S5 have wavelengths λ4 and λ5, respectively. λ4 is collimated by L4,
apertured by iris I4, and focused onto the discontinuous surface by the objective
lens OL. λ5 is collimated by L5, apertured by iris I5, and focused onto the
continuous surface by the lens CFL in conjunction with the objective lens OL.
The reflected light is focused onto the detectors D4 and D5 by the lenses IL4 and
IL5, respectively. The partially transmitting, partially reflecting mirror, DM4, is
preferentially dichroic, reflecting λ4 and λ5 and transmitting λn and δλn, n=1,2,3.
The mirrors, M4, M5 and M6, are partially transmitting, partially reflecting. In
the case that λ4 and λ5 are distinct, M6 is preferentially dichroic.

According to the embodiment wherein the sample resides in a microtiter
plate, λ4 is focused onto the well bottom. The object plane can be offset from
the well bottom by a variable distance. This is accomplished by adjusting L4 or
alternatively by an offset adjustment in the servo control loop. For convenience
of description, it will be assumed that λ4 focuses in the object plane.

The operation of the autofocus system is as follows. If the bottom of the 
sample well is not in the focal plane of objective lens OL, detector D 4 generates 
an error signal that is supplied through switch SW to the Z control. The Z 
control controls a motor (not shown) for moving the microtiter plate toward or 
away from the objective lens. Alternatively, the Z control could move the 
objective lens. If the bottom PB of the microtiter plate is not at the focal plane 
of the combination of the lens CFL and the objective lens OL, detector D 5 
generates an error signal that is applied through switch SW to the Z control. An 
XY control controls a motor (not shown) for moving the microtiter plate in the
object plane OP of lens OL.

As indicated, the entire scan is under computer control. An exemplary
scan follows: At the completion of an image in a particular well, the computer
operates SW to switch control of the servo mechanism from the error signal
generated by D4 to that generated by D5; the computer then directs the XY
control to move the plate to the next well, after which the servo is switched back
to D4.

The "coarse" focusing mechanism utilizing the signal from the bottom of
the plate is used to maintain the position of the sample plane to within the
well-to-well variations in the thickness of the plate bottom, so that the range over
which the "fine" mechanism is required to search is minimized. If, for example,
the diameter of the iris I5 is 2 mm and IL5 is 100 mm, then the image size on the
detector will be ~100 µm. Similarly, if the diameter of the iris I4 is 0.5 mm and
IL4 is 100 mm, then the image size on the detector will be ~400 µm. The latter
is chosen to be less sensitive so as to function as a "coarse" focus.
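
The well-to-well hand-off between the two focus error signals described in the exemplary scan can be written out as an ordered list of control actions. The following Python sketch is a simplified model, not the control software of the system: the action names and the well labels are hypothetical, and the imaging and stage-motion steps are represented only as entries in the list.

```python
def scan_actions(wells):
    """Return the ordered control actions for imaging a sequence of wells.

    While a well is imaged the servo follows the D4 error signal (well
    bottom). Between wells the switch SW hands control to the D5 error
    signal (plate bottom), so the focus loop is never opened.
    """
    actions = []
    for index, well in enumerate(wells):
        actions.append(("image", well, "D4"))
        if index + 1 < len(wells):
            actions.append(("switch", "D5"))
            actions.append(("move_xy", wells[index + 1]))
            actions.append(("switch", "D4"))
    return actions


if __name__ == "__main__":
    for action in scan_actions(["A1", "A2", "A3"]):
        print(action)
```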

As with the single-beam embodiment described above, the wavelengths
λ4 and λ5 are necessarily distinct from the sample fluorescence, and
preferentially wavelengths that cannot excite appreciable fluorescence in the
sample. Thus, λ4 and λ5 are preferentially in the near infrared, such as 800-1000
nm. In addition, the two wavelengths are preferably distinct, for example λ4 =
830 nm, λ5 = 980 nm.

In an alternative embodiment of two-beam autofocus, λ4 = λ5 and the two
beams may originate from the same source. Preferentially, the two beams are
polarized perpendicular to one another and M6 is a polarizing beamsplitter.

Pseudo-closed loop control is provided in the preferred embodiment of
single-beam autofocus which operates as follows. At the end of a scan the
computer operates SW to switch control to a sample-and-hold device which
maintains the Z control output at a constant level while the plate is moved on to
the next well, after which SW is switched back to D4.

Detection Devices 

A detection device is used having manifold, independent detection
elements in a plane conjugate to the object plane. As discussed above, line
illumination is advantageous principally in applications requiring rapid imaging.
The potential speed increase inherent in the parallelism of line illumination as
compared to point illumination is, however, only realized if the imaging system
is capable of detecting the light emitted from each point of the sample along the
illumination line, simultaneously.

It is possible to place a charge-coupled device (CCD), or other camera,
at the output of the prior art imaging systems described above (White et al., US
5,452,125 and Brakenhoff and Visscher, J. Microscopy 171 17-26 (1993)). The
resulting apparatus has three significant disadvantages compared to the present
invention. One is the requirement of rescanning the image onto the
two-dimensional detector, which adds unnecessary complexity to the apparatus.
Another is the requirement of a full two-dimensional detector having sufficient
quality over the 1000 pixel x 1000 pixel array that typically constitutes the
camera. The third disadvantage is the additional time required to read the full
image from the two-dimensional device.

To avoid these disadvantages and optimize not only imaging speed, 
within the constraints of high-sensitivity and low-noise detection, but also 
throughput, a continuous-read line-camera is used and in a preferred 
embodiment a rectangular CCD is used as a line-camera. Both embodiments 
have no dead-time between lines within an image or between images. An 
additional advantage is that a larger effective field-of-view is achievable in the 
stage-scanning embodiment, discussed below. 

The properties required of the detection device can be further clarified by
considering the following preferred embodiment. The resolution limit of the
objective lens is < 1 µm, typically ~0.5 µm, and the detector comprises an array
of ~1000 independent elements. Resolution, field-of-view (FOV) and image
acquisition-rate are not independent variables, necessitating compromise among
these performance parameters. In general, the magnification of the optical
system is set so as to image as large a FOV as possible without sacrificing
resolution. For example, a ~1 mm field-of-view could be imaged onto a
1000-element array at 1-µm pixelation. If the detection elements are 20-µm square,
then the system magnification would be set to 20X. Note that this will not result
in 1-µm resolution. Pixelation is not equivalent to resolution. If, for example,
the inherent resolution limit of the objective lens is 0.5 µm and each 0.5 µm x
0.5 µm region in the object plane is mapped onto a pixel, the true resolution of
the resulting digital image is not 0.5 µm. To achieve true 0.5-µm resolution, the
pixelation would need to correspond to a region ~0.2 µm x 0.2 µm in the object
plane. In one preferred embodiment, the magnification of the imaging system is
set to achieve the true resolution of the optics.
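
The trade-off between pixelation, magnification and field-of-view described above reduces to two ratios. The Python sketch below works through the two cases in the text (1-µm pixelation, and the ~0.2-µm pixelation needed for true 0.5-µm resolution) for a detector with 20-µm elements and 1000 elements per line; the function names are illustrative.

```python
def magnification(detector_element_um, object_pixel_um):
    """System magnification that maps one object-plane pixel onto one element."""
    return detector_element_um / object_pixel_um


def field_of_view_um(n_elements, object_pixel_um):
    """Length in the object plane covered by a line of n_elements pixels."""
    return n_elements * object_pixel_um


if __name__ == "__main__":
    for pixel_um in (1.0, 0.2):
        m = magnification(20.0, pixel_um)
        fov = field_of_view_um(1000, pixel_um)
        print(f"pixelation {pixel_um} um: {m:.0f}X, field of view {fov:.0f} um")
```

Moving from 1-µm to 0.2-µm pixelation raises the required magnification from 20X to about 100X and shrinks the field-of-view from about 1 mm to about 0.2 mm, which illustrates why these parameters cannot be chosen independently.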

Presently, the highest detection efficiency, lowest noise detection 
devices having sufficient read-out speed for the present applications are CCD 
cameras. In Figures 6A, 6B and 6C, a rectangular CCD camera is depicted
having an m x n array of detector elements where m is substantially less than n. 
The image of the fluorescence emission covers one row that is preferably 
proximate to the read register. This minimizes transfer time and avoids 
accumulating spurious counts into the signal from the rows between the 
illuminated row and the read-register. 

In principle, one could set the magnification of the optical system so that 
the height of the image of the slit SF 2 on the CCD camera is one pixel, as 
depicted in Figures 4A, 4B and 4C. In practice, it is difficult to maintain perfect 
alignment between the illumination line and the camera row-axis, and even 
more difficult to maintain alignment among three cameras and the illumination 
in the multi-wavelength embodiment as exemplified in Figs. 3 and 4. By 
binning together a few of the detector elements, exemplarily two to five, in each 
column of the camera the alignment condition can be relaxed while suffering a 
minimal penalty in read-noise or read-time.

An additional advantage of the preferred embodiment having one or 
more rectangular CCD cameras as detection devices in conjunction with a 
variable-width detection spatial filter, SF 2 in Figs. 3 and 4 and 210 in Figure 2, 
each disposed in a plane conjugate to the object plane, is elucidated by the 
following. As discussed above, in one embodiment of the present invention the 
detection spatial filter is omitted and a line-camera is used as a combined 
detection spatial filter and detection device. But as was also discussed above, a 
variable-width detection spatial filter permits the optimization of the detection 
volume so as to optimize the sample-dependent signal-to-noise ratio. The 
following preferred embodiment retains the advantage of a line-camera, namely 
speed, and the flexibility of a variable detection volume. The magnification is 
set so as to image a diffraction-limited line of height h onto one row of the 
camera. The width of the detection spatial filter d is preferably variable, h < d <
10h. The detectors in the illuminated columns of the camera are binned, prior to
reading, which is an operation that requires a negligible time compared to the
exposure- and read-times.
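
The effect of binning the illuminated rows in each column can be illustrated in software. In the camera the binning is performed on the chip before read-out; the Python sketch below is only a software analogue that sums a few adjacent rows column by column, with a small made-up frame standing in for the CCD array.

```python
def bin_rows(frame, first_row, n_rows):
    """Sum n_rows adjacent rows of a frame, column by column.

    frame is a list of rows (each a list of counts); the result is a single
    line with one summed value per column.
    """
    rows = frame[first_row:first_row + n_rows]
    if len(rows) != n_rows:
        raise ValueError("requested rows fall outside the frame")
    return [sum(column) for column in zip(*rows)]


if __name__ == "__main__":
    # Hypothetical 4-row x 3-column frame; rows 0-2 are illuminated.
    frame = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [0, 0, 0],
    ]
    print(bin_rows(frame, first_row=0, n_rows=3))
```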

In one preferred embodiment, the cameras are Princeton Instruments 
NTE/CCD-1340/100-EMD. The read-rate in a preferred embodiment is 1 MHz 
at a few electrons of read-noise. The pixel format is 1340x100, and the camera 
can be wired to shift the majority of the rows (80%) away from the region of 
interest, making the camera effectively 1340x20. 

In addition to the above mentioned advantage of a continuous read 
camera, namely the absence of dead-time between successive acquisitions, an 
additional advantage is that it permits the acquisition of rectangular images 
having a length limited only by the extent of the sample. The length is 
determined by the lesser of the camera width and the extent of the line
illumination. In a preferred embodiment the sample is disposed on the bottom
of a well in a 96-well microtiter plate, the diameter of which is 7 mm. A strip 1
µm x 1 mm is illuminated and the radiation emitted from the illuminated area is
imaged onto the detection device. The optical train is designed such that the
field-of-view is ~1 mm². According to the present invention, an image of the
well-bottom can be generated at 1-µm pixelation over a 1 x 7-mm field.
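
The size of such a strip image, and a lower bound on the time needed to read it, can be estimated from the figures given. The Python sketch below assumes that every pixel is read once at the 1 MHz read-rate quoted for the preferred camera and ignores any stage acceleration or settling time, so the result is only an idealised estimate.

```python
def strip_image_stats(length_mm, width_mm, pixel_um, read_rate_hz):
    """Return (lines, pixels per line, read time in seconds) for a strip image."""
    lines = round(length_mm * 1000.0 / pixel_um)
    pixels_per_line = round(width_mm * 1000.0 / pixel_um)
    read_time_s = lines * pixels_per_line / read_rate_hz
    return lines, pixels_per_line, read_time_s


if __name__ == "__main__":
    # 1 x 7-mm field at 1-um pixelation, 1 MHz read-rate
    print(strip_image_stats(7.0, 1.0, 1.0, 1e6))
```

Under these assumptions the 1 x 7-mm field corresponds to 7000 lines of 1000 pixels, or about seven seconds of continuous read-out.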

Environmental Control

In an embodiment of the present invention, assays are performed on live 
cells. Live-cell assays frequently require a reasonable approximation to 
physiological conditions to run properly. Among the important parameters is 
temperature. It is desirable to incorporate a means to raise and lower the 
temperature, in particular, to maintain the temperature of the sample at 37 °C. In
another embodiment, control over relative humidity, and/or CO2 and/or O2 is
necessary to maintain the viability of live cells. In addition, controlling
humidity to minimize evaporation is important for small sample volumes. 

Three embodiments providing a microtiter plate at an elevated
temperature, preferably 37 degrees C, compatible with the LCI system follow.

The imaging system preferably resides within a light-proof enclosure. In
a first embodiment, the sample plate is maintained at the desired temperature by
maintaining the entire interior of the enclosure at that temperature. At 37
degrees C, however, unless elevated humidity is purposefully maintained,
evaporation cooling will reduce the sample volume, limiting the assay duration.

A second embodiment provides a heated cover for the microwell plate 
which allows the plate to move under the stationary cover. The cover has a 
single opening above the well aligned with the optical axis of the microscope. 
This opening permits dispensing into the active well while maintaining heating 
and limited circulation to the remainder of the plate. A space between the 
heated cover plate and microwell plate of approximately 0.5 mm allows free 
movement of the microwell plate and minimizes evaporation. As the contents of 
the interrogated well are exposed to ambient conditions through the dispenser
opening for at most a few seconds, said contents suffer no significant 
temperature change during the measurement. 

In a third embodiment, a thin, heated sapphire window is used as a plate 
bottom enclosure. A pattern of resistive heaters along the well separators
maintains the window temperature at the desired level.

In additional embodiments, the three disclosed methods can be variously 
combined. 

In an additional preferred embodiment of the invention, employed in 
automated screening assays, the imaging system is integrated with plate- 
handling robots, such as the Zymark Twister. 

Data Processing System 

Figure 7 shows a schematic illustration of data processing components of a 
system arranged in accordance with the invention. The system, based on the 
Amersham Biosciences IN Cell Analyzer™ system, includes a confocal 
microscope 400 as described above, which includes the detectors D1, D2, D3, D4,
D5, the switch SW, a control unit 401, an image data store 402 and an
Input/Output (I/O) device 404. An associated computer terminal 405 includes a 
central processing unit (CPU) 408, memory 410, a data storage device such as a
hard disc drive 412 and I/O devices 406 which facilitate interconnection of the
computer with the MDPU and the computer with a display element 432 of a 
screen 428 via a screen I/O device 430, respectively. Operating system 
programs 414 are stored on the hard disc drive 412, and control, in a known 
manner, low level operation of the computer terminal 405. Program files and 
data 420 are also stored on the hard disc drive 412, and control, in a known 
manner, outputs to an operator via associated devices and output data stored on 
the hard disc drive. The associated devices include a display 432 as an element 
of the screen 428, a pointing device (not shown) and keyboard (not shown), 
which receive input from, and output information to, the operator via further I/O 
devices (not shown). Included in the program files 420 stored on the hard drive 
412 are an image processing and analysis application 416, an assay control 
application 418, and a database 422 for storing image data received from the 
microscope 400 and output files produced during data processing. The image 
processing and analysis application 416 may be a customized version of known
image processing and analysis software packages.

The performance of an assay using the confocal microscope 400 is 
controlled using control application 418, and the image data are acquired. After 
the end of acquisition of image data for at least one well in a microtiter plate by 
at least one detector D1, D2, D3, the image data are transmitted to the computer
405 and stored in the database 422 on the computer terminal hard drive 412, at 
which point the image data can be processed using the image processing and 
analysis application 416, as will be described in greater detail below. 

Luminescent Reporters Expressed in Cells

Numerous variations of the assay methods described below can be 
practiced in accordance with the invention. In general, a characteristic spatial 
and/or temporal distribution of one or more luminescence reporters in cells is 
used to quantify the assay. Advantageously, luminescence is observed from an 
essentially planar surface using a line-scan confocal microscope as described 
above.

In preferred embodiments of the invention, luminescent reporters are
provided in a manner as described in our previous International patent
application WO 03/031612. The position in the cell cycle of a population of
cells is determined by:

a) expressing in the cells a nucleic acid reporter construct, 
preferably a DNA construct, comprising a nucleic acid sequence encoding a 
detectable live-cell reporter molecule, operably linked to and under the control 
of: 

i) at least one cell cycle phase-specific expression control
element, and

ii) a destruction control element;

wherein said reporter construct is expressed in a cell at a
predetermined point in the cell cycle; and 

b) determining the position of cells in the cell cycle by monitoring 
luminescent signals emitted by the reporter molecule. 

The nucleic acid reporter construct is also preferably linked to and under 
the control of a cell cycle phase-specific spatial localisation control element. 

The cell cycle phase-specific expression control element is typically a 
DNA sequence that controls transcription and/or translation of one or more 
nucleic acid sequences and permits the cell cycle specific control of expression. 
Any expression control element that is specifically active in one or more phases 
of the cell cycle may suitably be used for construction of the cycle position 
reporter construct. 

Suitably, the cell cycle phase specific expression control element may be 
selected from cell cycle specific promoters and other elements that influence the 
control of transcription or translation in a cell cycle specific manner. Where the 
expression control element is a promoter, the choice of promoter will depend on 
the phase of the cell cycle selected for study. 

Suitable promoters include: cyclin B1 promoter (Cogswell et al, Mol. 
Cell Biol., (1995), 15(5), 2782-90, Hwang et al, J.Biol.Chem., (1995), 270(47), 
28419-24, Piaggio et al, Exp. Cell Res., (1995), 216(2), 396-402); Cdc25B 
promoter (Korner et al, J.Biol.Chem., (2001), 276(13), 9662-9); cyclin A2 
promoter (Henglein et al, Proc.Nat.Acad.Sci.USA, (1994), 91(12), 5490-4, 
Zwicker et al, Embo J., (1995), 14(18), 4514-22); Cdc2 promoter (Tommasi and 
Pfeifer, Mol. Cell Biol., (1995), 15(12), 6901-13, Zwicker et al, Embo J., (1995), 
14(18), 4514-22); Cdc25C promoter (Korner and Muller, J.Biol.Chem., (2000), 
275(25), 18676-81, Korner et al, Nucl. Acids Res., (1997), 25(24), 4933-9); 
cyclin E promoter (Botz et al, Mol. Cell Biol., (1996), 16(7), 3401-9, Korner 
and Muller, J.Biol.Chem., (2000), 275(25), 18676-81); Cdc6 promoter 
(Hateboer et al, Mol. Cell Biol., (1998), 18(11), 6679-97, Yan et al, 
Proc.Nat.Acad.Sci.USA, (1998), 95(7), 3603-8); DHFR promoter (Shimada et 
al, J.Biol.Chem., (1986), 261(3), 1445-52, Shimada and Nienhuis, J.Biol.Chem., 
(1985), 260(4), 2468-74) and histone promoters (van Wijnen et al, 
Proc.Nat.Acad.Sci.USA, (1994), 91, 12882-12886). 

Suitably, the cell cycle phase specific expression control element may be 
selected from cell cycle specific IRES elements and other elements that 
influence the control of translation in a cell cycle specific manner. An IRES 
element is an internal ribosomal entry site that allows the binding of a ribosome 
and the initiation of translation to occur at a region of mRNA which is not the 
5'-capped region. A cell cycle-specific IRES element restricts cap-independent 
initiation of translation to a specific stage of the cell cycle (Sachs, A.B., Cell, 
(2000), 101, 243-5). Where the expression control element is selected to be an 
IRES, suitably its selection will depend on the cell cycle phase under study. In 
this case, a constitutively expressed (e.g. CMV or SV40) or inducible (e.g. 
pTet-on/pTet-off system, Clontech) promoter may be used to control the transcription 
of the bicistronic mRNA (Sachs, A.B., Cell, (2000), 101, 243-5). Alternatively, 
a non cell cycle phase-dependent IRES element (e.g. the EMCV IRES found in 
pIRES vectors, BD Clontech) may be used in conjunction with a cell cycle 
specific promoter element. Alternatively, more precise control of expression of 
the reporter may be obtained by using a cell cycle phase specific promoter in 
conjunction with a cell cycle phase specific IRES element. 

IRES elements suitable for use in the invention include: G2-IRES 
(Cornelis et al, Mol. Cell, (2000), 5(4), 597-605); HCV IRES (Honda et al, 
Gastroenterology, (2000), 118, 152-162); ODC IRES (Pyronnet et al, Mol. Cell, 
(2000), 5, 607-616); c-myc IRES (Pyronnet et al, Mol. Cell, (2000), 5(4), 607- 
16) and p58 PITSLRE IRES (Cornelis et al, Mol. Cell, (2000), 5(4), 597-605). 

Table 1 lists some preferred expression control elements that may be 
used in accordance with the invention, and indicates the cell cycle phase in 
which each element is activated. 



Table 1: Cell Cycle Phase-Specific Expression Control Elements 

Element               Timing      Element               Timing 
Cyclin B1 promoter    G2          DHFR promoter         late G1 
Cdc25B promoter       S/G2        Histone promoters     late G1/S 
Cyclin A2 promoter    S           G2-IRES               G2 
Cdc2 promoter         S           HCV IRES              M 
Cdc25C promoter       S           ODC IRES              G2/M 
Cyclin E promoter     late G1     c-myc IRES            M 
Cdc6 promoter         late G1     p58 PITSLRE IRES      G2/M 
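
By way of illustration only, the entries of Table 1 can be held in a simple lookup structure so that candidate elements are retrieved by phase. The following Python sketch is not part of the disclosed method; the names `TABLE_1` and `elements_active_in` are hypothetical, and matching is done purely on the timing label printed in the table.

```python
# Table 1 transcribed as (element, timing) pairs. The timings are the labels
# printed in the table, not measured activation windows.
TABLE_1 = [
    ("Cyclin B1 promoter", "G2"),
    ("Cdc25B promoter", "S/G2"),
    ("Cyclin A2 promoter", "S"),
    ("Cdc2 promoter", "S"),
    ("Cdc25C promoter", "S"),
    ("Cyclin E promoter", "late G1"),
    ("Cdc6 promoter", "late G1"),
    ("DHFR promoter", "late G1"),
    ("Histone promoters", "late G1/S"),
    ("G2-IRES", "G2"),
    ("HCV IRES", "M"),
    ("ODC IRES", "G2/M"),
    ("c-myc IRES", "M"),
    ("p58 PITSLRE IRES", "G2/M"),
]


def elements_active_in(phase):
    """Return elements whose timing label mentions the given phase.

    A label such as "S/G2" is split on "/" and a leading "late " is
    ignored, so "late G1/S" matches both "G1" and "S".
    """
    matches = []
    for element, timing in TABLE_1:
        phases = [p.replace("late ", "").strip() for p in timing.split("/")]
        if phase in phases:
            matches.append(element)
    return matches


m_phase_elements = elements_active_in("M")
g2_phase_elements = elements_active_in("G2")
```

A promoter and an IRES returned for the same phase could then be considered together where more precise control of expression is wanted.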



The destruction control element is a DNA sequence encoding a protein 
motif that controls the destruction of proteins containing that sequence. 
Suitably, the destruction control element may be cell cycle mediated, for 
example: Cyclin B1 D-box (Glotzer et al, Nature, (1991), 349, 132-138, 
Yamano et al, EMBO J., (1998), 17(19), 5670-8, Clute and Pines, Nature Cell 
Biology, (1999), 1, 82-87); cyclin A N-terminus (den Elzen and Pines, J. Cell 
Biol., (2001), 153(1), 121-36, Geley et al, J. Cell Biol., (2001), 153, 137-48); 
KEN box (Pfleger and Kirschner, Genes Dev., (2000), 14(6), 655-65); Cyclin E 
(Yeh et al, Biochem Biophys Res Commun., (2001), 281, 884-90); Cln2 cyclin 
from S. cerevisiae (Berset et al, Mol. Cell Biol., (2002), pp4463-4476) and 
p27Kip1 (Montagnoli et al, Genes Dev., (1999), 13(9), 1181-1189, Nakayama et 
al, EMBO J., (2000), 19(9), 2069-81, Tomoda et al, Nature, (1999), 398(6723), 
160-5). 

Table 2 lists destruction control elements that may be used according to 
the invention and indicates the cell cycle phase, in which each element is 
activated. 

Table 2: Destruction Control Elements 

Element                Timing 
Cyclin B1 D-box        Metaphase through to G1 phase 
Cyclin A N-terminus    Prometaphase through to G1 phase 
KEN box                anaphase/G1 
p27Kip1                G1 
Cyclin E               G1/S boundary 
Cln2                   G1/S boundary 



Alternatively, the destruction control element may be non cell-cycle 
mediated, such as PEST sequences as described by Rogers et al, Science, 
(1986), 234, 364-8. Examples of non cell-cycle mediated destruction control 
elements include sequences derived from casein, ornithine decarboxylase and 
proteins that reduce protein half-life. Use of such non cell-cycle mediated 
destruction control sequences in the method of the invention provides means for 
determining the persistence time of the cell cycle reporter following induction of 
expression by a cell cycle specific promoter. 

Suitably, the live-cell reporter molecule encoded by the nucleic acid 
sequence may be selected from the group consisting of fluorescent proteins and 
enzymes. Preferred fluorescent proteins include Green Fluorescent Protein 
(GFP) from Aequorea victoria and derivatives of GFP such as functional GFP 
analogues in which the amino acid sequence of wild type GFP has been altered 
by amino acid deletion, addition, or substitution. Suitable GFP analogues for 
use in the present invention include EGFP (Cormack, B.P. et al, Gene, (1996), 
173, 33-38); EYFP and ECFP (US 6066476, Tsien, R. et al); F64L-GFP (US 
6172188, Thastrup, O. et al); BFP (US 6077707, Tsien, R. et al). Other 
fluorescent proteins include DsRed, HcRed and other novel fluorescent proteins 
(BD Clontech and Labas, Y.A. et al, Proc Natl Acad Sci U S A (2002), 99, 
4256-61) and Renilla GFP (Stratagene). Suitable enzyme reporters are those 
which are capable of generating a detectable (e.g. a fluorescent or a luminescent) 
signal in a substrate for that enzyme. Particularly suitable enzyme/substrates 
[illegible] 

[illegible] optionally include a cell cycle phase-specific localisation control 
element comprising a DNA sequence encoding a protein motif that is capable of 
controlling the sub-cellular localisation of the protein in a cell cycle specific 
manner. Such a localisation control element may be used advantageously 
according to the invention where: 

i) [illegible]; and/or 

ii) more precise determination of the cell cycle position is required. 

It may be required to determine the sub-cellular localisation of the 
reporter either to ensure its effective operation and/or destruction. More precise 
determination of the cell cycle position [illegible] localisation control element 
since [illegible] location of the reporter signal. 

Suitable spatial localisation control elements include those that regulate 
localisation of a cell cycle control protein, for example the cyclin B1 CRS. 

The term "operably linked" as used herein indicates that the elements are 
arranged so that they function in concert for their intended purposes [illegible]. 

Figures 8A, 8B and 8C illustrate the general construction of a DNA 
construct according to the invention, in which Figure 8A shows a construct 
utilising a cell cycle phase-specific promoter and no internal ribosome entry site 
(IRES) element, Figure 8B shows a construct utilising an IRES element 
[illegible], and Figure 8C shows a construct utilising a constitutive or inducible 
mammalian promoter and a cell cycle phase-specific IRES as the expression 
control element. [illegible] A represents a cell cycle phase-specific expression 
control element (promoter), B represents a cell cycle phase specific destruction 
control element, C represents a cell cycle phase specific localisation control 
element, D represents a reporter gene, E represents a non-cell cycle specific 
IRES element, F represents a mammalian selectable marker, G represents a 
mammalian constitutive promoter and H represents a cell cycle specific IRES 
element. 

In a preferred embodiment of the invention, the construct comprises a 
cyclin B1 promoter, a cyclin B1 destruction box (D-box), a cyclin B1 
cytoplasmic retention sequence (CRS) and a green fluorescent protein (GFP). 

In one embodiment, the nucleic acid reporter construct comprises an 
expression vector comprising the following elements: 

a) a vector backbone comprising: 

i) a bacterial origin of replication; and 

ii) a bacterial drug resistance gene; 

b) a cell cycle phase specific expression control element; 

c) a destruction control element; and 

d) a nucleic acid sequence encoding a reporter molecule. 

Optionally, the nucleic acid reporter construct additionally contains a 
cell cycle phase-specific spatial localisation control element and/or a eukaryotic 
drug resistance gene, preferably a mammalian drug resistance gene. 

Expression vectors may also contain other nucleic acid sequences, such 
as polyadenylation signals, splice donor/splice acceptor signals, intervening 
sequences, transcriptional enhancer sequences, translational enhancer sequences 
and the like. Optionally, the drug resistance gene and the reporter gene may be 
operably linked by an internal ribosome entry site (IRES), which is either cell 
cycle specific (Sachs, et al, Cell, (2000), 101, 243-245) or cell cycle independent 
(Jang et al, J. Virology, (1988), 62, 2636-2643 and Pelletier and Sonenberg, 
Nature, (1988), 334, 320-325), rather than the two genes being driven from 
separate promoters. When using a non cell-cycle specific IRES element, the 
pIRES-neo and pIRES-puro vectors commercially available from Clontech may 
be used. 

In a particular embodiment, the nucleic acid reporter construct is 
assembled from a DNA sequence encoding the cyclin B1 promoter operably 
linked to DNA sequences encoding 171 amino acids of the amino terminus of 
cyclin B1 and a DNA sequence encoding a green fluorescent protein (GFP) 
(Figure 9). The construct illustrated in Figure 9 contains a cyclin B1 promoter 
(A), cyclin B1 destruction box (D-box) (B), cyclin B1 CRS (C) and a GFP 
reporter (D). Motifs controlling the localisation and destruction of cyclin B1 
have all been mapped to [illegible] amino acids in the amino terminus of the 
molecule. Conveniently, an artificial cell cycle marker can be constructed using 
only sequences from the amino terminus of cyclin B1, which will not interfere 
with cell cycle progression since it lacks a specific sequence, termed the cyclin 
box (Nugent et al, J. Cell. Sci., (1991), 99, 669-674), which is required to bind 
to and activate a partner kinase. Key regulatory motifs required from the amino 
terminus sequence of cyclin B1 are: 

i) a nine amino acid motif termed the destruction box (D-box). 
This is necessary to target cyclin B1 to the ubiquitination machinery and, in 
conjunction with at least one C-terminal lysine residue, this is also required for 
its cell-cycle specific degradation; 

ii) an approximately ten amino acid nuclear export signal (NES). This 
motif is recognised, either directly or indirectly, by exportin 1 and is sufficient 
to maintain the bulk of cyclin B1 in the cytoplasm throughout interphase; 

iii) approximately four mitosis-specific phosphorylation sites that are 
located in and adjacent to the NES and confer rapid nuclear import and a 
reduced nuclear export at mitosis. 

When expressed in a eukaryotic cell, the construct will exhibit cell cycle 
specific expression and destruction of the GFP reporter which parallels the 
expression and degradation of endogenous cyclin B1. Hence, measurement of 
GFP fluorescence intensity permits identification of cells in the G2/M phase of 
the cell cycle (Figure 10). Furthermore, since the fluorescent product of the 
construct will mimic the spatial localisation of endogenous cyclin B1, analysis 
of the sub-cellular distribution of fluorescence permits further precision in 
assigning cell cycle position. At prophase, cyclin B1 rapidly translocates into 
the nucleus; consequently the precise localisation of GFP fluorescence in the 
cell can be used to discriminate cells transitioning from interphase to mitosis. 
Once a cell reaches metaphase, and the spindle assembly checkpoint is satisfied, 
cyclin B1 is very rapidly degraded, and consequently the disappearance of GFP 
fluorescence can be used to identify cells at mid-M phase. 

Expression of the construct in a population of unsynchronised cells will 
result in each cell exhibiting cyclical expression and destruction of the 
fluorescent product from the construct, resulting in a continuous blinking pattern 
of fluorescence from all cells in the population. Analysis of the fluorescence 
intensity of each cell with time consequently yields dynamic information on the 
cell cycle status of each cell. 
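
The per-cell analysis of intensity over time can be sketched as follows. This is a minimal illustration in Python, assuming a list of background-corrected intensities sampled at equal intervals for a single cell; the function names `find_peaks` and `mean_peak_interval`, the threshold and the synthetic trace are all hypothetical and are not taken from the specification.

```python
def find_peaks(intensities, min_height):
    """Return indices of local maxima at or above min_height.

    A sample counts as a peak when it is strictly greater than the
    previous sample and at least as large as the next one.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        value = intensities[i]
        if (value >= min_height
                and value > intensities[i - 1]
                and value >= intensities[i + 1]):
            peaks.append(i)
    return peaks


def mean_peak_interval(peak_indices, hours_per_frame):
    """Estimate the cycle period as the mean spacing between peaks."""
    if len(peak_indices) < 2:
        return None  # fewer than two peaks: period cannot be estimated
    gaps = [b - a for a, b in zip(peak_indices, peak_indices[1:])]
    return hours_per_frame * sum(gaps) / len(gaps)


# Synthetic trace for one cell, one frame every 2 hours (made-up values).
trace = [5, 6, 20, 80, 10, 5, 6, 7, 25, 90, 12, 5, 6, 8, 30, 85, 9]
peaks = find_peaks(trace, min_height=50)
period_h = mean_peak_interval(peaks, hours_per_frame=2)
```

Each peak marks a frame at which the reporter signal was maximal before its destruction, so the spacing between peaks gives a rough per-cell estimate of cycle length.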

Further embodiments of the nucleic acid reporter construct according to 
the first aspect may be constructed by selecting suitable alternative cell cycle 
control elements, for example from those shown in Tables 1 and 2, to design cell 
cycle phase reporters which report a desired section of the cell cycle. 

The construction and use of expression vectors and plasmids are well 
known to those of skill in the art. Virtually any mammalian cell expression 
vector may be used in connection with the cell cycle markers disclosed herein. 
Examples of suitable vector backbones which include bacterial and mammalian 
drug resistance genes and a bacterial origin of replication include, but are not 
limited to: pCI-neo (Promega), pcDNA (Invitrogen) and pTriEx1 (Novagen). 
Suitable bacterial drug resistance genes include genes encoding proteins that 
confer resistance to antibiotics including, but not restricted to: ampicillin, 
kanamycin, tetracycline and chloramphenicol. Eukaryotic drug selection 
markers include agents such as: neomycin, hygromycin, puromycin, zeocin, 
mycophenolic acid, histidinol, gentamycin and methotrexate. 

The DNA construct may be prepared by the standard recombinant 
molecular biology techniques of restriction digestion, ligation, transformation 
and plasmid purification, by methods familiar to those skilled in the art and 
as described in Sambrook, J. et al (1989), Molecular Cloning - A Laboratory 
Manual, Cold Spring Harbor Laboratory Press. Alternatively, the construct can 
be prepared synthetically by established methods, e.g. the phosphoramidite 
method described by Beaucage and Caruthers, (Tetrahedron Letters, (1981), 22, 
1859-1869) or the method described by Matthes et al (EMBO J., (1984), 3, 801- 
805). According to the phosphoramidite method, oligonucleotides are 
synthesised, e.g. in an automatic DNA synthesizer, purified, annealed, ligated 
and cloned into suitable vectors. The DNA construct may also be prepared by 
polymerase chain reaction (PCR) using specific primers, for instance as 
described in US4683202 or by Saiki et al (Science, (1988), 239, 487-491). A 
review of PCR methods may be found in PCR Protocols, (1990), Academic 
Press, San Diego, California, U.S.A. 

During the preparation of the DNA construct, the gene sequence 
encoding the reporter must be joined in frame with the cell cycle phase specific 
destruction control element and optionally the spatial localisation control 
element. The resultant DNA construct should then be placed under the control 
of one or more suitable cell cycle phase specific expression control elements. 
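
The in-frame requirement amounts to simple arithmetic on nucleotide counts, sketched below in Python. The function name `is_in_frame` and the junction lengths are hypothetical; only the figure of 171 amino acids comes from the description, and the sketch does not check for stop codons at the junction.

```python
CODON_LENGTH = 3


def is_in_frame(upstream_coding_nt, linker_nt=0):
    """True if a downstream reporter starts on a codon boundary.

    upstream_coding_nt: nucleotides from the start codon to the end of the
    upstream element (for example a destruction control element).
    linker_nt: nucleotides contributed by a restriction site or linker
    between the upstream element and the reporter.
    """
    return (upstream_coding_nt + linker_nt) % CODON_LENGTH == 0


# 171 amino acids of cyclin B1 correspond to 171 * 3 = 513 nucleotides.
n_term_nt = 171 * CODON_LENGTH

# A 6-nucleotide junction (the length of a typical restriction site)
# preserves the reading frame, whereas a 7-nucleotide junction shifts it.
six_nt_ok = is_in_frame(n_term_nt, linker_nt=6)
seven_nt_ok = is_in_frame(n_term_nt, linker_nt=7)
```

A frame shift at the junction would cause the reporter sequence to be read as an unrelated peptide, so a check of this kind is a cheap safeguard before cloning.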

The host cell into which the construct, or the expression vector containing 
such a construct, is introduced may be any cell which is capable of expressing 
the construct and may be selected from eukaryotic cells, for example from the 
group consisting of a mammalian cell, a fungal cell, a nematode cell, a fish cell, 
an amphibian cell, a plant cell and an insect cell. 

The prepared DNA reporter construct may be transfected into a host cell 
using techniques well known to the skilled person. One approach is to 
temporarily permeabilise the cells using either chemical or physical procedures. 
These techniques may include: electroporation (Tur-Kaspa et al, Mol. Cell Biol., 
(1986), 6, 716-718; Potter et al, Proc.Nat.Acad.Sci.USA, (1984), 81, 7161- 
7165), a calcium phosphate based method (e.g. Graham and Van der Eb, 
Virology, (1973), 52, 456-467 and Rippe et al, Mol. Cell Biol., (1990), 10, 689- 
695) or direct microinjection. 

Alternatively, cationic lipid based methods (e.g. the use of Superfect 
(Qiagen) or Fugene6 (Roche)) may be used to introduce DNA into cells (Stewart 
et al, Human Gene Therapy, (1992), 3, 267; Torchilin et al, FASEB J, (1992), 6, 
2716; Zhu et al, Science, (1993), 261, 209-211; Ledley et al, J. Pediatrics, 
(1987), 110, 1; Nicolau et al, Proc.Nat.Acad.Sci.USA, (1983), 80, 1068; 
Nicolau and Sene, Biochem.Biophys.Acta, (1982), 721, 185-190). Jiao et al 
(Biotechnology, (1993), 11, 497-502) describe the use of bombardment mediated 
gene transfer protocols for transferring and expressing genes in brain tissues, 
which may also be used to transfer the DNA into host cells. 

A further alternative method for transfecting the DNA construct into 
cells utilises the natural ability of viruses to enter cells. Such methods include 
vectors and transfection protocols based on, for example, Herpes simplex virus 
(U.S. Pat 5288641), cytomegalovirus (Miller, Curr. Top. Microbiol. Immunol., 
(1992), 158, 1), vaccinia virus (Baichwal and Sugden, 1986, in Gene Transfer, 
ed. R. Kucherlapati, New York, Plenum Press, p117-148), and adenovirus and 
adeno-associated virus (Muzyczka, Curr. Top. Microbiol. Immunol., (1992), 
158, 97-129). 

Examples of suitable recombinant host cells include HeLa cells, Vero 
cells, Chinese Hamster ovary (CHO), U2OS, COS, BHK, HepG2, NIH 3T3, 
MDCK, RIN, HEK293 and other mammalian cell lines that are grown in vitro. 
Such cell lines are available from the American Type Culture Collection 
(ATCC), Bethesda, Maryland, U.S.A. Cells from primary cell lines that have 
been established after removing cells from a mammal followed by culturing the 
cells for a limited period of time are also intended to be included in the present 
invention. 

Cell lines which exhibit stable expression of a cell cycle position reporter 
may also be used in establishing xenografts of engineered cells in host animals 
using standard methods (Krasagakis, K. et al, J. Cell Physiol., (2001), 187(3), 
386-91; Paris, S. et al, Clin.Exp.Metastasis, (1999), 12(10), 817-22). Xenografts 
of tumour cell lines engineered to express cell cycle position reporters will 
enable establishment of model systems to study tumour cell division, stasis and 
metastasis and to screen new anticancer drugs. 

Use of engineered cell lines or transgenic tissues expressing a cell cycle 
position reporter as allografts in a host animal will permit study of mechanisms 
affecting tolerance or rejection of tissue transplants (Pye, D. and Watt, D.J., J. 
Anat., (2001), 198 (Pt 2), 163-73; Brod, S.A. et al, Transplantation, (2000), 
69(10), 2162-6). 

To perform the method for determining the cell cycle position of a cell 
according to the second aspect, cells transfected with the DNA reporter 
construct may be cultured under conditions and for a period of time sufficient to 
allow expression of the reporter molecule at a specific stage of the cell cycle. 
Typically, expression of the reporter molecule will occur between 16 and 72 
hours post transfection, but may vary depending on the culture conditions. If the 
reporter molecule is based on a green fluorescent protein sequence, the reporter 
may take a defined time to fold into a conformation that is fluorescent. This 
time is dependent upon the primary sequence of the green fluorescent protein 
derivative being used. The fluorescent reporter protein may also change colour 
with time (see for example, Terskikh, Science, (2000), 290, 1585-8), in which 
case imaging is required at specified time intervals following transfection. 

In the embodiment of the invention wherein the nucleic acid reporter 
construct comprises a drug resistance gene, following transfection and 
expression of the drug resistance gene (usually 1 - 2 days), cells expressing the 
modified reporter gene may be selected by growing the cells in the presence of 
an antibiotic to which transfected cells are resistant due to the presence of a 
selectable marker gene. The purpose of adding the antibiotic is to select for 
cells that express the reporter gene and that have, in some cases, integrated the 
reporter gene, with its associated promoter, IRES elements, enhancer and 
termination sequences into the genome of the cell line. Following selection a 
clonal cell line expressing the construct can be isolated using standard 
techniques. The clonal cell line may then be grown under standard conditions 
and will express reporter molecule and produce a detectable signal at a specific 
point in the cell cycle. 



Examples of Production of Stable Cell Lines 

Example 1 - Preparation of DNA construct 

i) The N-terminal third of the cyclin B1 mRNA (amino acids 1-171), 
encoding the cyclin B1 destruction box and the NES, was amplified with 
HindIII and BamHI ends using standard PCR techniques and the following 
primers: 
GGGAAGCTTAGGATGGCGCTCCGAGTCACCAGGAAC 
GCCGGATCCCACATATTCACTACAAAGGTT. 

ii) The gene for wtGFP was amplified with primers designed to 
introduce restriction sites that would facilitate construction of fusion proteins. 
The PCR product was cloned into pTARGET (Promega) according to 
manufacturer's instructions and mutations (F64L/S175G/E222G) were 
introduced using the QuikChange site-directed mutagenesis kit (Stratagene). 
Constructs were verified by automated DNA sequencing. DNA encoding the 
mutant GFP was then cloned downstream of the cyclin B1 N-terminal region 
using BamHI and SalI restriction sites. 

iii) The cell cycle dependent region of the cyclin B1 promoter (-150 
to +182) was amplified with SacII and HindIII sites and cloned upstream of the 
cyclin B1 N-terminal region and the GFP fusion protein. 

iv) The promoter and recombinant protein encoding DNA was 
excised and cloned in place of the CMV promoter in a BglII/NheI cut pCI-Neo 
derived vector. 
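
As a simple consistency check on step i), the two primers can be scanned for the recognition sequences of the enzymes named there. The Python sketch below is illustrative only: the helper `find_sites` is hypothetical, and the second primer is read with G as its fifth character, which is an assumption about an unclear character in the source text.

```python
# Recognition sequences of the two enzymes named in step i) of Example 1.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}

# Primers from Example 1. The fifth character of the second primer is read
# as G (an assumption; the character is unclear in the source).
FORWARD_PRIMER = "GGGAAGCTTAGGATGGCGCTCCGAGTCACCAGGAAC"
REVERSE_PRIMER = "GCCGGATCCCACATATTCACTACAAAGGTT"


def find_sites(sequence):
    """Map each enzyme to the 0-based positions of its site in sequence."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


forward_hits = find_sites(FORWARD_PRIMER)
reverse_hits = find_sites(REVERSE_PRIMER)
# Position of the first ATG in the forward primer (candidate start codon).
start_codon_index = FORWARD_PRIMER.find("ATG")
```

Under that reading, each primer carries exactly one of the two sites near its 5' end, consistent with amplification "with HindIII and BamHI ends".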

Example 2 - Effect of cell cycle blocking agents on GFP fluorescence 
from cell cycle phase marker using transiently transfected cells 

U2OS cells (ATCC HTB-96) were cultured in wells of a 96 well 
microtitre plate. Cells were transfected with a cell cycle reporter construct 
prepared according to Example 1, comprising a cyclin B1 promoter operably 
linked to sequences encoding the cyclin B1 D-box, the cyclin B1 CRS, and GFP 
in a pCORON4004 vector (Amersham Biosciences) using Fugene 6 (Roche) as 
the transfection agent. 

Following 24 hours of culture, cells were exposed to the specific cell 
cycle blockers mimosine (blocks at G1/S phase boundary) or demecolcine 
(blocks in M phase). Control cells were exposed to culture media alone. 
Cells were incubated for a further 24 hours and then analysed for nuclear 
GFP expression using a confocal scanning imager with automated image 
analysis (IN Cell Analysis System, Amersham Biosciences). 

Cells exposed to demecolcine showed increased fluorescence compared 
to control cells, while cells exposed to mimosine showed decreased fluorescence 
compared to control cells. Cells blocked in G1/S phase (mimosine treated), 
prior to the time of activation of the cyclin B1 promoter, show reduced 
fluorescence, while cells blocked in M phase (demecolcine treated), prior to the 
time of action of the cyclin B1 D-box, show increased fluorescence. 

Example 3 - Microinjection of the construct 

HeLa cells were micro-injected with the construct prepared according to 
Example 1 and examined by time lapse microscopy. Differential interference 
contrast (DIC) images were made along with the corresponding fluorescence 
images. A cell in metaphase showed bright fluorescence in the nucleus. The 
same cell was imaged similarly at later times in anaphase and late anaphase. The 
DIC images showed the division of the cell into two daughter cells; the 
corresponding fluorescence images showed the loss of fluorescence 
accompanying destruction of the fluorescent construct as the cell cycle 
progresses. 

Example 4 - Stable cell line production 

U2-OS cells (ATCC HTB-96) were transfected with the construct 
described in Example 1 and grown for several months in culture media 
containing 1 mg/ml geneticin to select for cells stably expressing the construct. 
A number of clones were picked by standard methods (e.g. described in 
Freshney, Chapter 11 in Culture of Animal Cells, (1994) Wiley-Liss Inc) and a 
clone containing fluorescent cells was isolated. This cell line was maintained at 
37°C in culture media containing 25 mM HEPES. 

Example 5 - Preparation of a brighter stable cell line 

The green fluorescent protein reporter sequence in the vector described 
in Example 1 was replaced with enhanced GFP (EGFP; Cormack, B.P. et al, 
Gene, (1996), 173, 33-38; BD Clontech) by standard methods. The EGFP gene 
is a brighter form of GFP containing the mutations F64L and S65T. In addition, 
EGFP contains codons that have been altered to optimise expression in 
mammalian cells. This new construct was transfected into U2-OS cells and a 
number of colonies were isolated by selection with geneticin followed by sorting 
of single cells using a fluorescence activated cell sorter. These clones showed 
brighter fluorescence than those generated in Example 4 and, as expected, 
fluorescence intensity and location appeared to vary according to the cell cycle 
phase of the cell. 

Assays and Image Acquisition 

According to embodiments of the invention, screening assays are 
conducted using libraries of chemical compounds. One or more multiwell plates 
are prepared using a cell line as described above. Whilst in the following 
embodiments a cell line including a cell cycle reporter construct as described in 
Example 1 above is used, it should be appreciated that any other of the described 
embodiments of cell line can be used in alternative embodiments. A controlled 
amount of cells, referred to herein as a population, is placed in a carrier solution 
in each of the wells of the plate and allowed to establish for a predetermined 
period, for example 24 hours. Next, a different one of the library of chemical 
compounds is added in a controlled concentration and amount to each of the 
wells and allowed to stand for a predetermined period, for example 24 hours. In 
some embodiments of the invention, a nuclear stain is added before imaging is 
conducted. In other embodiments of the invention, no nuclear stain is added 
before imaging is conducted. Next, imaging is conducted for each well of the 
plate in turn, using a confocal microscope as described above. A small area in 
the centre of each well, at the bottom of the well, is imaged to acquire image 
data in one or more channels of the selected area. The fluorescence detected in 
the confocal microscope is converted into one or more digital images in which 
the digital values are proportional to the intensity of the fluorescent radiation 
incident on each pixel of the detection device. 

Image Processing and Analysis 

In general the processing and analysis of the image data in accordance 
with the invention comprises a number of discrete steps. The image data are 
analyzed to identify areas of image data corresponding to individual cells, as in 
step 1 of Figure 1. Such object areas may be sub-cellular components of 
individual cells, such as the cell nuclei. A binary mask is generated from one of 
the digital images in which all values meeting one or more criteria are replaced 
by one and all values failing to meet the criteria are replaced by zero. Generally, the 
one or more criteria include a threshold value determined from an image taken 
in a set-up procedure for the assay. The mask is searched for groups of 
contiguous value-one pixels to identify the object areas corresponding to 
individual cells. Next, measurements are made on the individual cells using the 
identified object areas. 
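
The masking and object-finding steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the single criterion is taken to be "pixel value greater than a threshold", contiguity is taken to mean 4-connectivity, and the names `threshold_mask` and `label_objects` are hypothetical.

```python
from collections import deque


def threshold_mask(image, threshold):
    """Binary mask: 1 where the pixel value exceeds threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def label_objects(mask):
    """Group 4-connected value-one pixels into objects.

    Returns a list of objects, each a list of (row, col) coordinates.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects


# Tiny made-up image containing two bright regions.
image = [
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 8],
    [0, 0, 0, 8, 8],
]
mask = threshold_mask(image, threshold=5)
objects = label_objects(mask)
```

Each entry of `objects` then plays the role of one object area on which per-cell measurements are made.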

The cell cycle phase marker used has a fluorescence signal that varies 
according to the phase of the cell cycle of the cell in a manner which is 
illustrated in Figure 11. Four different patterns can be distinguished in this 
embodiment of the invention: 

1. G0/G1/S phase cells have relatively low expression of the cell cycle 
phase marker, both in the nucleus and the cytoplasm; 

2. G2 cells have relatively low nuclear, and relatively high cytoplasmic, 
expression of the cell cycle phase marker; 

3. M cells have relatively high nuclear expression, with substantially no 
cytoplasmic expression of the cell cycle phase marker; 

4. P cells have relatively high nuclear, and relatively high cytoplasmic, 
expression of the cell cycle phase marker. 
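
The four patterns listed above lend themselves to a simple rule-based assignment from two per-cell measurements. The Python sketch below is illustrative only; the function name `classify_phase` and the three cut-off parameters are hypothetical placeholders, and in practice such cut-offs would have to be set for each assay.

```python
def classify_phase(nuclear, cytoplasmic, nuc_high, cyto_high, cyto_min):
    """Assign one of the four patterns from two mean intensities.

    nuc_high and cyto_high separate "relatively low" from "relatively
    high"; cyto_min is the level below which cytoplasmic signal is treated
    as substantially absent. All three are placeholder cut-offs.
    """
    if nuclear >= nuc_high:
        if cytoplasmic < cyto_min:
            return "M"          # high nuclear, substantially no cytoplasmic
        if cytoplasmic >= cyto_high:
            return "P"          # high nuclear, high cytoplasmic
        return "unclassified"   # high nuclear, intermediate cytoplasmic
    if cytoplasmic >= cyto_high:
        return "G2"             # low nuclear, high cytoplasmic
    return "G0/G1/S"            # low in both compartments


# Made-up intensities with cut-offs of 100 (high) and 20 (absent).
example_calls = [
    classify_phase(10, 10, 100, 100, 20),
    classify_phase(10, 150, 100, 100, 20),
    classify_phase(150, 5, 100, 100, 20),
    classify_phase(150, 150, 100, 100, 20),
]
```

The explicit "unclassified" branch is a design choice of this sketch: cells falling between the described patterns are flagged rather than forced into a class.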

Furthermore, in an embodiment the mitotic cells can be distinguished 
into MP (metaphase) cells, A (anaphase) cells, T (telophase) cells and C 
(cytokinesis) cells. Schematic illustrations of the signal intensities and 
distributions of the fluorescent reporter in these cell cycle phases are shown in 
Figure 12. Early Gl phase cells can also be distinguished in this embodiment. 

A nuclear marker, producing fluorescence at a wavelength different to 
that of the cell cycle phase marker, is used in another embodiment to identify 
nuclear areas for each cell under analysis in the image data. The nuclear marker 
may be one of the toxic intercalating nuclear dyes (such as DRAQ5™ or a 
Hoechst™ dye, for example Hoechst 33342). Alternatively, in assays in which 
the same cell population is imaged and analysed to determine its relative cell 
cycle sub-populations a number of times during a time course study, a non-toxic 
nuclear marker may be used. Such a non-toxic marker may be in the form of an 
NLS-fluorescent protein fusion. For example, the Clontech™ pHcRed1-Nuc 
vector, when transfected into a cell line in accordance with the present 
invention, produces a red fluorescence signal in the nucleus. During image 
acquisition, an image of the cell nuclei is acquired in a first channel 
corresponding to the nuclear marker, a cell cycle phase analysis image is 
acquired in a second channel corresponding to the cell phase marker, and the 
two images are coregistered such that the pixels of each image are aligned. 

The cell nuclei image is analysed first to identify nuclear areas in the 
image data. A nuclear signal threshold may be set to accurately differentiate the 
edges of the nuclear areas. A segmentation algorithm, for example a watershed 
segmentation algorithm (S. Beucher, F. Meyer, "Morphological Segmentation", 
Journal of Visual Communication and Image Representation, 1:21-46, 1990 and 
Vincent, Soille, IEEE Transactions on Pattern Analysis and Machine 
Intelligence, 13:583-598, 1991) is applied to the thresholded image to uniquely 
identify the area of the nucleus of each individual cell being analysed. 

From each nuclear object area identified, two binary masks, defining
object areas in which the cell measurements are to be taken, are generated - an
eroded nuclear mask (to sample the cell cycle phase marker intensity signal in
the central part of the nucleus) and a thin cytoplasmic ring (to sample the cell
cycle phase marker intensity signal in the cytoplasm near the nucleus). The
nuclear object area is eroded from the edge of the nuclear object by a
predetermined number of pixels, for example three pixels, to generate the eroded
nuclear mask. To generate the thin cytoplasmic ring, representing the
cytoplasmic area adjacent to the nucleus, the nuclear object is dilated from its
edge by a predetermined number of pixels, for example two pixels.

Measurements on individual cells

The two masks, generated for each individual cell as described above, are
then applied to the cell cycle phase analysis image.
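
The mask generation described above can be sketched as follows. This is a minimal illustration only: it assumes a binary nuclear object held as nested Python lists of 0/1 values and a 4-connected pixel neighbourhood, neither of which is specified in the text. The function names (`erode`, `dilate`, `cytoplasmic_ring`) are hypothetical, and taking the ring to be the dilated area with the nuclear object itself excluded is an assumption.

```python
# 4-connected neighbourhood (an assumption; the text does not specify one).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def _neighbour_values(mask, y, x):
    """Yield the values of the 4-connected neighbours; off-image pixels count as 0."""
    h, w = len(mask), len(mask[0])
    for dy, dx in NEIGHBOURS:
        ny, nx = y + dy, x + dx
        yield mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode(mask, pixels):
    """Shrink a binary mask from its edge by `pixels` pixels."""
    for _ in range(pixels):
        mask = [[1 if mask[y][x] and all(_neighbour_values(mask, y, x)) else 0
                 for x in range(len(mask[0]))] for y in range(len(mask))]
    return mask


def dilate(mask, pixels):
    """Grow a binary mask from its edge by `pixels` pixels."""
    for _ in range(pixels):
        mask = [[1 if mask[y][x] or any(_neighbour_values(mask, y, x)) else 0
                 for x in range(len(mask[0]))] for y in range(len(mask))]
    return mask


def cytoplasmic_ring(nucleus, pixels):
    """Ring around the nucleus: the dilated object minus the object itself."""
    grown = dilate(nucleus, pixels)
    return [[1 if g and not n else 0 for g, n in zip(grown_row, nucleus_row)]
            for grown_row, nucleus_row in zip(grown, nucleus)]
```

With the example values from the text, the eroded nuclear mask would be `erode(nucleus, 3)` and the ring `cytoplasmic_ring(nucleus, 2)`.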

Measurements are then derived from the image data, as in step 2 of
Figure 1. The fluorescence signal intensities in each pixel in the eroded nuclear
mask area are averaged to produce a measurement of the average nuclear signal
intensity ($I_n$) parameter, which represents the average intensity over the
nuclear area.

The fluorescence signal intensities in each pixel in the thin cytoplasmic
ring are averaged to produce a measurement of the average cytoplasmic signal
intensity ($I_c$) parameter, representing the average intensity within the
cytoplasmic sampling ring. Note that, in the case of a mitotic cell, the
"cytoplasmic" luminescence signal intensity, whilst taken in an area generally
corresponding to a cytoplasmic component of a selected cell, is mainly outside
the bounds of the cell (and therefore tends to be relatively low).

The ratio of the two measured average intensities is then taken to
generate the nuclear/cytoplasmic ratio ($I_n / I_c$) parameter, representing the
ratio of nuclear and cytoplasmic average intensities.
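
As a minimal sketch of these measurements, the following averages the pixel intensities under each mask and takes their ratio. The image and masks are assumed to be equally sized nested lists; the function names are hypothetical and no handling is shown for a cytoplasmic average of zero.

```python
def mean_intensity(image, mask):
    """Average the image intensities over the pixels selected by a binary mask."""
    values = [image[y][x]
              for y, row in enumerate(mask)
              for x, selected in enumerate(row) if selected]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def nuclear_cytoplasmic_ratio(image, nuclear_mask, ring_mask):
    """Ratio of average nuclear intensity to average cytoplasmic ring intensity."""
    i_n = mean_intensity(image, nuclear_mask)  # average nuclear intensity
    i_c = mean_intensity(image, ring_mask)     # average cytoplasmic intensity
    return i_n / i_c  # assumes i_c is non-zero
```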

A parameter set is associated with each cell identified from one or more
object areas in the image data. The parameter set is derived from the
measurements taken from the image data, at step 3 of Figure 1. In the present
example, the parameter set includes a floating point number representative of
the nuclear/cytoplasmic ratio. However, the parameter set may consist of any
number of measurements derived from the image. For example, if the cells were
expressing several markers, the intensity of each marker would be a
measurement for inclusion in the parameter set. The parameter set is derived
automatically for each cell which is identified from an object area and the
appropriate measurement values are put in the parameter set. The parameter set
is saved to a database of classifying data when the associated cell is classified.

Measurements may be taken from any identified object area. For
example, if the nucleus and cytoplasm are identified as object areas, one or more
measurements could be taken from both or either.

Measurements may be derived from a variety of parameters, including:

I, a parameter relating to an average image signal intensity within an
identified object area;

F, a parameter relating to a fraction of pixels that deviate more than a
given amount from an average signal intensity within an identified object area;

H, a parameter relating to the number of pixels with a signal intensity
below a given threshold within an identified object area;

A, a parameter relating to a ratio between major and minor axes of an
elliptical outline corresponding to an identified object area;

R, a parameter relating to a maximum width of an identified object area;

L, a parameter relating to an average width of an identified object area;

C, a parameter relating to signal texture within an identified object area;

M, a parameter relating to margination in an identified object area.

Other parameters which may be used include the presence, absence or
concentration of cellular features such as neurites, membrane ruffles,
cytoplasmic granules, etc. If a plurality of measurements is taken for a plurality
of parameters, one or more of the measurements may be weighted in statistical
importance. The measurement of a parameter that is known to be more reliably
indicative of cell cycle phase would be weighted more heavily than a parameter
which is not as reliably indicative.
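
By way of illustration, the three intensity-based parameters listed above (I, F and H) might be computed from the pixel values of one object area as follows. This is a sketch under assumptions: the function name, the dictionary keys and the use of an absolute deviation for F are not taken from the text.

```python
def intensity_parameters(pixels, deviation, threshold):
    """Compute I, F and H for the pixel intensities of one object area.

    I: average signal intensity.
    F: fraction of pixels deviating from the average by more than `deviation`.
    H: number of pixels with an intensity below `threshold`.
    """
    average = sum(pixels) / len(pixels)
    deviating = sum(1 for p in pixels if abs(p - average) > deviation)
    below = sum(1 for p in pixels if p < threshold)
    return {"I": average, "F": deviating / len(pixels), "H": below}
```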

In embodiments where n measurements are taken from the cell image 
data, the parameter set may be represented as an n-dimensional volume in a
space. This volume is a feature vector, in a feature space. The representation of 
the parameter set as a feature vector in a feature space is described in more 
detail below. 

Each cell identified as an object area from the image data is identified as
being a member of a subpopulation, initially by a user, as in step 4 of Figure 1.
In an embodiment, the user may make an identification by selecting a cell by
right-clicking a mouse when pointing at the cell on a screen, and then entering
the identifying data, for example by left-clicking the mouse when pointing at a
selected classification presented in a selection box. In the case of cell cycle
phase classification, the identifying data will be one of the following: G0, G1,
S, G2 and Mitotic (M), and may also include the phases of mitosis: prophase,
metaphase, anaphase, and telophase.

The identifying data is received (step 5 in Figure 1) and then saved to a
database (step 6 of Figure 1) in association with the selected cell's parameter
set, to form classifying data. In this way, a database of classifying data, made
up of parameter sets associated with identifying data, is built up for later use in
automated classification.

As noted above, in an embodiment of the present invention a parameter
set made up of multiple parameter measurements may be represented or
modeled as a vector in an n-dimensional feature space. Figure 13 shows a
feature vector 350, representative of the parameter set of a cell, in a
three-dimensional space. The space has three axes x, y and z and the feature
vector 350 has three dimensions x, y, z representative of three parameters, such
as those listed above, which in the parameter set are assigned measurement
values taken from the image of a cell. These measurement values are translated
into the dimensions of the vector.

Each cell identified from image data has identification data and an 
associated vector representative of the cell's parameter set. A sample of cells 
analysed according to the method of the present invention would result in a 
multiplicity of vectors occupying one feature space. If the parameters for which 
measurements are taken are indicative of the desired cell classification, the 
vectors will form clusters, indicating that the parameters are reliable classifying 
markers. The clusters may fill the entire feature space, and the borders between 
the clusters can be set to form decision boundaries. 

For example, in the embodiment of the invention in which the method is
applied to analyse the cell cycle phases of a sample of cells, the parameters for
which measurements were taken are all related to markers of cell cycle phase
change (e.g. cell-cycle phase specific protein phosphorylation, such as histone
H3 phosphorylation). Hence, the vectors that represent each cell would cluster
according to the cell cycle phase of the cell from which the parameter set is
derived. Consequently, for each different subpopulation identified in a sample
(in the present example, 'prophase', 'metaphase', 'anaphase', 'telophase', 'G1',
'S', 'G2'), there would be a distinct cluster of overlapping vectors in the feature
space modeled for the sample.

Classification of further sets of cells 

Step 8 of Figure 1 is the classification of a second set of cells based on
the classifying data derived from user-led identification of a first set of cells.
The process of saving previous user-led cell identifications in association with
parameter sets derived from image data measurements is termed 'supervised
learning.'

As described above, individual cells are identified and the identifying
data is related to parameter sets which are in turn represented as vectors in a
feature space. Cells that are identified by the user are considered to be the
'training set', on which a system according to the present invention stores
parameter sets in association with the user-entered identifying data, to derive
classifying data. Once a suitably large database of classifying data has been built
up from user identification of cells, the system may be instructed to perform
unsupervised classification. As described with reference to step 8 of Figure 1,
unsupervised classification consists of analysis of a second set of cells, the
analysis involving division into object areas, and the taking of measurements in
a manner similar to steps 1 and 2. The measurements are then analysed and the
second set of cells is divided into subpopulations, on the basis of the
measurements taken for cells in the second set of cells, by use of the cell
classifying data. The second set of cells may be termed the 'unsupervised' set.

The division into subpopulations may involve simple comparisons of the
measurement values for a parameter. For example, if all mitotic cells in the
training set had a measured value of a for parameter B, any cells to be classified
in the unsupervised set that have a sufficiently similar measured value of a for
parameter B will also be classed as mitotic. However, the measurement values
of cells in the unsupervised set will rarely tally exactly with the measurement
values in parameter sets from the training set, due to normal biological variation.
Therefore, in order to divide the unsupervised set into subpopulations, in an
embodiment statistical techniques are employed to calculate the subpopulation
into which each cell in the unsupervised set should be classified.

If each parameter set derived from the training set is represented as a 
vector in a feature space, as described above, the feature vectors will cluster in 
the feature space according to the classification of the corresponding cell. Once 
trained, when a computer system implementing an embodiment of the invention 
is instructed to classify a set of cells, each cell is identified, analyzed as 
described above, and a feature vector is generated from a parameter set derived from
its measurements. The feature vector of the cell is then analysed statistically in 
comparison with the clusters of feature vectors derived from the training set. 

In cases where more than one measurement is taken from the 
image for more than one parameter, one or more of the measurements may be 
weighted. By weighting the value of a measurement of a parameter, that 
measurement has a more significant effect on the outcome of the classification. 
For example, if the measurements of three parameters a, b and c are taken and a 
is known to be a more reliable indicator of cell cycle phase, the value of a may 
be altered so that a has a proportionately greater effect on the results of an
algorithm that uses the parameter set a, b, c to classify cells according to cell 
cycle phase. Weighting has the further advantage of minimizing classification 
error that may be caused by variation in the value of unweighted measurements. 
The weighting may take the form of multiplication of the value of the 
measurement. The parameter measurements may also be normalized to correct 
for parameters with dominant values. 
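
One possible way to normalize and weight measurements before a feature vector is built is sketched below. It assumes z-score normalization against the training set followed by a per-parameter multiplicative weight; the text does not prescribe a particular normalization, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(training_vectors):
    """Return (mean, standard deviation) per parameter, taken from the training set."""
    columns = list(zip(*training_vectors))
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def normalize_and_weight(vector, scaler, weights):
    """Z-score each measurement, then multiply it by its weight."""
    return [w * (value - m) / s
            for value, (m, s), w in zip(vector, scaler, weights)]
```

Here a weight above 1 gives a parameter proportionately more influence on any distance computed from the resulting vector, while normalization stops a parameter with large raw values from dominating.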

One method of statistical analysis is minimum distance classification. In
this case, a cell from the unsupervised set will be classified into a subpopulation
based on the minimum distance between its feature vector and the clusters. The
cell to be classified is classified in the same subpopulation as the feature vectors
that make up the cluster that is nearest to the feature vector. Figure 14 illustrates
a two dimensional feature space in which minimum distance classification may
be implemented, with three clusters 300, 302 and 304 formed from the feature
vectors of training sets. Each cluster is representative of a different
subpopulation. Vector 306 (shown as a cross marking x and y values) is derived
from the parameter set of a cell from an unsupervised set which is to be
classified. Each cluster is modeled according to its centre of mass, represented
as the mean feature vector for the cluster. In the case of Figure 14, representing
a two dimensional feature space, the mean feature vector is at the geometrical
centre of each cluster. The feature vector is classified according to the mean
feature vector nearest to it (the minimum distance). In this case, the mean
feature vector of cluster 300 is nearest to feature vector 306 and so the cell
which feature vector 306 is representative of would be classified in the same
subpopulation as cluster 300.
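
A minimal sketch of minimum distance classification follows, assuming the training set is held as a dictionary mapping each subpopulation label to its list of feature vectors and that Euclidean distance is used. The function names and the data layout are hypothetical.

```python
import math


def cluster_means(training):
    """Mean feature vector (centre of mass) of each labelled cluster."""
    return {label: [sum(col) / len(col) for col in zip(*vectors)]
            for label, vectors in training.items()}


def classify_min_distance(feature_vector, means):
    """Return the label whose mean feature vector is nearest to the given vector."""
    return min(means, key=lambda label: math.dist(feature_vector, means[label]))
```

For example, with two clusters centred near (1, 0) and (11, 10), a vector at (2, 1) would be assigned to the first cluster.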

In the case of a two dimensional feature space (i.e. where only two
parameters of the cell image data have been measured), calculation of the
distance between two points is relatively straightforward. There are a number of
techniques to measure the distance between two points in multi-dimensional
space. These measures are known as similarity metrics.

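The most commonly used similarity metric is the Euclidean distance. If
$x_1$ and $x_2$ are two vectors whose similarity is to be checked, with
components $x_{1i}$ and $x_{2i}$ for $i = 1, \ldots, n$, then the Euclidean
distance is defined as:

$$ d_E = \left[ \sum_{i=1}^{n} (x_{1i} - x_{2i})^2 \right]^{1/2} $$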

The Euclidean distance measure has the property of giving greater
emphasis to larger differences on a single parameter. The classification can thus
be biased towards a parameter with dominant values. To overcome this
problem, parameter measurements can be normalized and/or otherwise weighted
using known statistical techniques prior to creating a feature vector.

Alternatively, if speed of processing is a priority, the city block or
interpoint distance metric may be implemented. The city block distance is also
known as the absolute value distance or 'Manhattan' distance. The city block
distance is computationally cheaper to calculate than the Euclidean distance.
The city block distance is defined as:

$$ d_{CB} = \sum_{i=1}^{n} \left| x_{1i} - x_{2i} \right| $$

The Chebyshev distance metric is also computationally cheaper than the
Euclidean distance. It is defined as:

$$ d_{Ch} = \max_{i} \left| x_{1i} - x_{2i} \right| $$
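
The Euclidean, city block and Chebyshev metrics can be sketched in a few lines each. The function names are hypothetical, and both vectors are assumed to have the same number of components.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def city_block(a, b):
    """Sum of the absolute component differences ('Manhattan' distance)."""
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    """Largest absolute component difference."""
    return max(abs(p - q) for p, q in zip(a, b))
```

The city block and Chebyshev forms avoid the squaring and square root of the Euclidean form, which is the source of their lower cost.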

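A further metric known as the Mahalanobis distance, which scales
differences by the covariance of the parameters, may also be implemented. The
Mahalanobis distance is defined as:

$$ d_M = \left[ (x_1 - x_2)^{T} \, \Sigma^{-1} \, (x_1 - x_2) \right]^{1/2} $$

where $\Sigma$ is the covariance matrix of the parameter measurements.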

The Mahalanobis distance metric has some useful properties. It 
automatically accounts for the scaling of the axes of the feature space, and 
corrects for correlation between related parameters (parameters whose 
measurement values are to a degree interdependent e.g. the concentration of 
products of genes which are co-regulated.) The Mahalanobis distance metric 
can also accommodate curved decision boundaries (borders between clusters), 
which is not possible with a Euclidean minimum distance metric, which is 
linear. However, computational requirements for the Mahalanobis distance 
metric grow quadratically with the number of parameters. 
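
To illustrate how the Mahalanobis distance accounts for axis scaling and correlation, the following sketch handles the two-parameter case only, where the covariance matrix can be inverted by hand. It assumes the covariance is estimated from the training vectors of one cluster using the sample (n - 1) estimator; the function names are hypothetical.

```python
import math


def mean_and_covariance_2d(vectors):
    """Mean vector and 2x2 sample covariance matrix of a cluster of 2-D vectors."""
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    sxx = sum((v[0] - mx) ** 2 for v in vectors) / (n - 1)
    syy = sum((v[1] - my) ** 2 for v in vectors) / (n - 1)
    sxy = sum((v[0] - mx) * (v[1] - my) for v in vectors) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance from x to a cluster mean, given the cluster covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form with the inverse matrix (1/det) * [[d, -b], [-c, a]].
    return math.sqrt((d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det)
```

With a covariance of ((4, 0), (0, 1)), a point two units away along the first axis is at distance 1, whereas a point two units away along the second axis is at distance 2, reflecting the different spread of the cluster along each axis.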

In the case of a highly complex feature space, a minimum distance 
similarity metric such as those described above may not be able to adequately 
classify feature vectors into clusters. Parameter choice will affect the 
complexity of the feature space. 

As an alternative to minimum distance calculation in any form, a feature
vector may be classified using the Bayesian maximum likelihood algorithm.
This algorithm is a special case of the general Bayes' classification, based on
Bayes' theorem. The subpopulations into which cells are classified are denoted
$\omega_i$, $i = 1, 2, \ldots, m$, where m is the total number of classes. When
trying to determine which subpopulation a cell represented by a feature vector
at position x in feature space belongs to, one can define a conditional
probability for each potential class:

$$ p(\omega_i \mid x), \quad i = 1, 2, \ldots, m. $$

The feature vector x may be shown as a column vector of parameter
measurements (feature 1, feature 2, up to feature n) that locates x in a
multidimensional feature space, for example:

$$ x = \begin{bmatrix} \text{feature 1} \\ \text{feature 2} \\ \vdots \\ \text{feature } n \end{bmatrix} $$

The conditional probability $p(\omega_i \mid x)$ gives the likelihood that the
sample at position x belongs to class $\omega_i$. Classification can then be
performed according to:

$$ x \in \omega_i \quad \text{if} \quad p(\omega_i \mid x) > p(\omega_j \mid x) \quad \text{for all } j \neq i $$

i.e. the sample belongs to class $\omega_i$ if $p(\omega_i \mid x)$ is the greatest.

The conditional probabilities in the above equation are initially
unknown. However, if the training data set is available, a probability
distribution function (PDF) for each type can be estimated. This PDF describes
the chance of finding a feature vector from class $\omega_i$ at position x. In
general terms this further probability can be represented by
$p(x \mid \omega_i)$. Therefore, for a feature vector at position x in
multidimensional space, a set of probabilities can be computed that gives the
relative likelihood that that feature vector belongs to a class $\omega_i$.

The desired $p(\omega_i \mid x)$ and the available $p(x \mid \omega_i)$ are
related by Bayes' theorem:

$$ p(\omega_i \mid x) = \frac{p(x \mid \omega_i) \, p(\omega_i)}{p(x)} $$

where $p(\omega_i)$ is the a-priori probability that class $\omega_i$ occurs in
the image and $p(x)$ is the probability of finding a sample of any class at
location x.

Substituting the above equations, and omitting $p(x)$ because it is common to
all classes, gives the classification rule:

$$ x \in \omega_i \quad \text{if} \quad p(x \mid \omega_i) \, p(\omega_i) > p(x \mid \omega_j) \, p(\omega_j) \quad \text{for all } j \neq i $$

With one modification for mathematical convenience, we can define the
discriminant function $g_i(x)$:

$$ g_i(x) = \ln \{ p(x \mid \omega_i) \, p(\omega_i) \} = \ln p(x \mid \omega_i) + \ln p(\omega_i) $$

The classification rule can thus be restated as:

$$ x \in \omega_i \quad \text{if} \quad g_i(x) > g_j(x) \quad \text{for all } j \neq i. $$

The implementation of a Bayesian method for determining the
classification of a feature vector has the advantage that multiple parameters may
be used, increasing the granularity of classification by allowing division of cells
into a greater number of subpopulations.
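
A Bayesian maximum likelihood classifier of this kind might be sketched as follows. Several assumptions are made that the text does not state: each class PDF is modelled as a Gaussian with independent parameters (a diagonal covariance), priors are taken from the class proportions in the training set, and the discriminant is the logarithm of the class-conditional density multiplied by the prior. All function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_class_models(training):
    """Estimate a prior and per-parameter Gaussian for each labelled class."""
    total = sum(len(vectors) for vectors in training.values())
    models = {}
    for label, vectors in training.items():
        columns = list(zip(*vectors))
        models[label] = {
            "prior": len(vectors) / total,
            "means": [mean(col) for col in columns],
            # A small floor keeps the variance strictly positive.
            "variances": [max(pvariance(col), 1e-9) for col in columns],
        }
    return models


def discriminant(x, model):
    """g_i(x) = ln p(x | class) + ln p(class), under the independence assumption."""
    log_likelihood = sum(
        -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        for xi, mu, var in zip(x, model["means"], model["variances"]))
    return log_likelihood + math.log(model["prior"])


def classify_bayes(x, models):
    """Assign x to the class with the greatest discriminant value."""
    return max(models, key=lambda label: discriminant(x, models[label]))
```

Working with logarithms turns the product of many small probabilities into a sum, which avoids numerical underflow when many parameters are measured.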

A quadratic or non-linear discriminant (QD) classifier may be used to
classify cells into subpopulations. A QD classifier is described in Thomaz, C.,
Gillies, D.F., and Feitosa, R. Q., Proc. Post-ECCV Workshop on Biometric
Authentication (2002). The QD classifier stipulates that an unknown feature
vector x is assigned to the class or group i that minimizes a function $d_i(x)$
dependent upon the true mean vector and the covariance matrix. This method of
classification performs comparatively well with a limited amount of training
data, and can also readily be used where measurements have been taken for a
multiplicity of parameters.

Further classification techniques that may be employed either alone or in
combination with the above techniques include multivariate Gaussian class
models (for the evaluation of results from Bayesian classification), density
estimation, and K-nearest neighbour classification (Therrien, C. W., Decision,
estimation and classification, John Wiley & Sons, 1989).
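
Of the techniques named above, K-nearest neighbour classification is the simplest to sketch. The version below assumes Euclidean distance, a training set held as (feature vector, label) pairs and a simple majority vote; the function name and the default of k = 3 are illustrative choices.

```python
import math
from collections import Counter


def knn_classify(x, training, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Unlike minimum distance classification, this compares the cell with individual training vectors rather than with cluster means, so it can follow irregular cluster shapes at the cost of scanning the whole training set for each cell.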

Neural networks may also be implemented in order to classify within
complex feature spaces. A neural network is a mathematical model for
information processing based on the bioelectrical networks in the brain, which
are formed by neurones and their synapses. In a neural network model, simple
nodes (or "neurons", or "units") are connected together to form a network of
nodes - hence the term "neural network".

The most common learning technique employed with neural networks is
backpropagation. The output values are compared with the correct answer to
compute the value of a predefined error function. By various techniques the
error is then fed back through the network. Using this information, the learning
algorithm adjusts the weights of each connection between nodes in
order to reduce the value of the error function by a small amount. After
repeating this process for a sufficiently large number of training cycles the
network will usually converge to some state where the error of the calculations
is small. In this case one says that the network has learned a certain target
function. To adjust weights properly a general method for nonlinear
optimization known as gradient descent may be applied. In this method, the
derivative of the error function with respect to the network connection weights
is calculated and the weights are then changed such that the error decreases (thus
going downhill on the surface of the error function).
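
The weight update can be illustrated with the smallest possible network: a single sigmoid node trained by gradient descent on a squared error function. This is a deliberate simplification, since with one node the backward pass reduces to a single application of the chain rule; a multi-layer network would propagate the same quantity back through each layer. The learning rate, epoch count and function names are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    """Output of the node for the feature vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, learning_rate=0.5, epochs=2000):
    """Fit the node to (feature vector, target) pairs, with targets 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            out = predict(x, weights, bias)
            # Error E = 0.5 * (out - target)^2, so dE/dz = (out - target) * out * (1 - out).
            delta = (out - target) * out * (1.0 - out)
            # Step each weight against its gradient, dE/dw_i = delta * x_i.
            weights = [w - learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * delta
    return weights, bias
```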

Creating a neural network that performs well, particularly in classifying
examples that differ significantly from the training examples, often requires
additional techniques. This is especially important for cases where only very
limited numbers of training examples are available. The network may 'overfit'
the training data by creating a statistical model of the data that has too many
parameters, and thereby fail to capture the true statistical process generating the
data. To counteract overfitting, an 'early stopping heuristic' can help the
network to generalize well to examples not in the training set. It should also
be noted that neural networks generally require a greater amount of training data
than minimum distance metric classification methods.
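
One common form of early stopping heuristic is sketched below. It assumes that, after each training cycle, the error is measured on a held-out portion of the user-classified cells that is not used for weight updates, and that training stops once this validation error has failed to improve for a fixed number of cycles. The `patience` parameter and the function name are illustrative and do not come from the text.

```python
def early_stopping(validation_errors, patience=3):
    """Return (best_epoch, best_error), stopping after `patience` cycles without improvement."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # validation error has stopped improving
    return best_epoch, best_error
```

In practice the weights saved at `best_epoch` would be the ones kept, since later cycles fit the training set more closely without improving performance on unseen cells.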

It will be understood that any of the above classification methods may be 
used individually or in combination with each other, in order to achieve 
optimum classification granularity. 

The above embodiments are to be understood as illustrative examples of
the invention. Further embodiments of the invention are envisaged.

Note that the term "luminescence" as used herein is intended to include
the phenomena of fluorescence and other types of luminescence such as
chemiluminescence and phosphorescence.

The cell cycle position of the cells may be determined in various
alternative embodiments of the invention by monitoring the expression of the
reporter molecule and detecting luminescence signals emitted by the reporter
using an appropriate detection device. If the reporter molecule produces a
fluorescent signal, then either a conventional fluorescence microscope or a
confocal based fluorescence microscope may be used. If the reporter molecule
produces luminous light, then a suitable device such as a luminometer may be
used. Using these techniques, the proportion of cells expressing the reporter
molecule may be determined.

If the DNA construct contains translocation control elements and the
cells are examined using a microscope, the location of the reporter may also be
determined.

In methods according to the present invention, the fluorescence of cells
transformed or transfected with the DNA construct may suitably be measured by
optical means in, for example, a spectrophotometer, a fluorimeter, a fluorescence
microscope, a cooled charge-coupled device (CCD) imager (such as a scanning
imager or an area imager), a fluorescence activated cell sorter, a confocal
microscope or a scanning confocal device, where the spectral properties of the
cells in culture may be determined as scans of light excitation and emission.

It is to be understood that any feature described in relation to any one 
embodiment may be used alone, or in combination with other features described, 
and may also be used in combination with one or more features of any other of 
the embodiments, or any combination of any other of the embodiments. 
Furthermore, equivalents and modifications not described above may also be 
employed without departing from the scope of the invention, which is defined in 
the accompanying claims. 






Claims 

1. A method of classifying cells into subpopulations using cell classifying
data, the method comprising:

receiving image data;

analyzing said image data to identify object areas in the image data;

analyzing said image data, on the basis of said identified object areas, to
determine, for at least one selected first cell, one or more measurements;

deriving a first parameter set for the first cell, the first parameter set
comprising at least one of said one or more measurements;

classifying a first set of cells, the process of classifying the first set of
cells including classifying the first cell into a subpopulation and storing first
identifying data indicating the subpopulation into which the first cell has been
classified;

deriving cell classifying data for use in classifying a second set of cells into
subpopulations from the first parameter set and the first identifying data, and

classifying a second set of cells into subpopulations on the basis of one
or more measurements taken for cells in the second set of cells, by use of the
cell classifying data.

2. The method of claim 1, wherein the identifying data is cell cycle
phase classifying data.

3. The method of claim 1 or claim 2, wherein classifying the second
set of cells comprises comparing the measurements for cells in the second set
with the cell cycle classifying data derived from classification of the first set of
cells.

4. The method of claim 1 or claim 2, wherein classifying the second
set of cells comprises calculating a statistical likelihood of each cell in the
second set being a member of a classified group.

5. The method of any preceding claim, wherein a plurality of
measurements are taken, and weighted in statistical importance.

6. The method of any preceding claim, wherein applying said cell
classifying data to a second set of cells further comprises generating cell cycle
phase population data indicative of the relative sizes of said plurality of
sub-populations in the selected cells.

7. The method according to any preceding claim, further comprising
performing the method with image data from a plurality of wells containing
cells, the plurality of wells containing different test compounds.

8. A method according to any preceding claim, wherein said object
areas are identified using a process arranged to select both nuclear and
cytoplasmic areas of a cell.

9. A method according to any preceding claim, wherein said object
areas include, for a selected cell, a first type of object area and a second type of
object area, and wherein said one or more measurements include a first
measurement determined using said first type of object area and a second
measurement determined using said second type of object area.

10. A method according to claim 9 or 10, wherein said first type of
object area is identified using a process arranged to select a predominantly
nuclear area of a cell.

11. A method according to claim 9, wherein said second type of
object area is identified using a process arranged to select a predominantly
cytoplasmic area of a cell.

12. A method according to any preceding claim, wherein said one or
more measurements include, for a selected cell, a first measurement determined
using an identified object area and a second measurement determined using an
identified object area.

13. A method according to claim 12, wherein said first and second
measurements are determined using the same identified object area.

14. The method of any preceding claim, wherein cells of the first and
second sets of cells comprise at least one first luminescent reporter, wherein said
step of receiving image data comprises receiving first image data created by
detecting radiation emitted by said first luminescent reporter, and

wherein said step of analyzing said image data to determine one or more
measurements comprises analyzing said first image data.

15. A method according to claim 14, wherein said step of analyzing
said image data to identify object areas comprises analyzing said first image
data.

16. A method according to claim 14 or claim 15, wherein at least
one cell in the first and second sets of cells further comprises a second
luminescent reporter indicative of the location of a sub-cellular component in a
cell.

17. The method of claims 14, 15 or 16 wherein said step of receiving
image data comprises:

a) receiving first image data created by detecting radiation emitted
by said first luminescent reporter; and

b) receiving second image data created by detecting radiation
emitted by said second luminescent reporter,

wherein said step of analyzing said image data to identify object areas
comprises analyzing said second image data, and

wherein said step of analyzing said image data to determine one or more
measurements comprises analyzing said first image data.

18. A method according to any of claims 14 to 17, wherein said one
or more measurements include a measurement of a cytoplasmic luminescence
signal intensity, taken in an area generally corresponding to a cytoplasmic
component of a selected cell.

19. A method according to any of claims 14 to 17, wherein said one
or more measurements include a measurement of a nuclear luminescence signal

20. A method according to any of claims 14 to 19, wherein said step
of analyzing said image data to identify object areas comprises analyzing said
first image data.

21. A method according to any preceding claim, wherein the
classifying data is used in conjunction with an algorithm to classify a selected
cell into a selected first one of a plurality of sub-populations of cells.

22. The method of claim 21, wherein the algorithm takes into
account a plurality of measurements in a parameter set.

23. A method according to any preceding claim, wherein said one or
more measurements include one or more measurements selected from the group
consisting of:

I, a parameter relating to an average signal intensity within an identified
object area;

F, a parameter relating to a fraction of pixels that deviate more than a
given amount from an average signal intensity within an identified object area;

H, a parameter relating to the number of pixels with a signal intensity
below a given threshold within an identified object area;

A, a parameter relating to a ratio between major and minor axes of an
elliptical outline corresponding to an identified object area;

R, a parameter relating to a maximum width of an identified object area;

L, a parameter relating to an average width of an identified object area;

C, a parameter relating to signal texture within an identified object area;

M, a parameter relating to margination in an identified object area.

24. The method of any preceding claim, wherein a second parameter 
set is derived from the one or more measurements taken for the second set of 
cells. 

25. The method of any preceding claim further comprising the 
modeling of a parameter set as a feature vector in an n-dimensional feature 
space, where n is equal to the number of parameters. 

26. The method of claim 25, wherein a feature vector representing 
the second parameter set and a feature vector representing the first parameter set 
occupy the same feature space. 

27. The method of claim 26, wherein a distance is calculated between 
the feature vectors. 

28. The method of claim 27, wherein the distance between the 
feature vectors is indicative of the classification of the feature vector 
representing the second parameter set. 
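
By way of a non-limiting sketch, the distance-based classification of claims 25 to 28 might look as follows in Python. The Euclidean metric, the nearest-reference rule, the function names, and the two-dimensional example vectors are all assumptions chosen for illustration; the claims require only that a distance between feature vectors in the same feature space is indicative of the classification.

```python
import math


def euclidean_distance(u, v):
    """Distance between two feature vectors in the same n-dimensional space."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same number of parameters")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify_by_nearest(vector, references):
    """Return the label of the reference vector closest to `vector`.

    references -- dict mapping a subpopulation label to a feature vector
                  derived from the first (already classified) set of cells
    """
    return min(references, key=lambda label: euclidean_distance(vector, references[label]))


# Hypothetical reference vectors from the first parameter set (n = 2).
references = {"G2": [3.0, 1.0], "mitosis": [9.0, 4.0]}
# Feature vector representing the second parameter set.
label = classify_by_nearest([8.0, 5.0], references)
```

Here the second-set vector lies closer to the "mitosis" reference than to the "G2" reference, so it is assigned to that subpopulation.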

29. The method of any of claims 25 to 28, wherein a cell represented 
by a feature vector representing the second parameter set is classified according 
to a calculation of probability. 

30. The method of claim 29, wherein the calculation of probability 
comprises calculating the likelihood that the cell represented by the feature 
vector representing the second parameter set is in the same subpopulation as a 
cell represented by a feature vector representing the first parameter set, the 
calculation being based on the dimensions of the feature vectors. 
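
One possible reading of the probability calculation of claims 29 and 30 is sketched below in Python. The model is an assumption, not taken from the claims: each subpopulation is described by an independent Gaussian per dimension fitted to first-set feature vectors, with equal prior weights. All names and data values are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_subpopulation(vectors):
    """Per-dimension (mean, variance) of the first-set vectors of one subpopulation."""
    dims = list(zip(*vectors))
    # A small floor keeps the variance strictly positive.
    return [(mean(d), max(pvariance(d), 1e-9)) for d in dims]


def log_likelihood(vector, model):
    """Log-likelihood of `vector` under independent per-dimension Gaussians."""
    total = 0.0
    for x, (mu, var) in zip(vector, model):
        total += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
    return total


def membership_probabilities(vector, models):
    """Normalized probability of each subpopulation, assuming equal priors."""
    logs = {label: log_likelihood(vector, m) for label, m in models.items()}
    peak = max(logs.values())  # subtract the peak for numerical stability
    weights = {label: math.exp(v - peak) for label, v in logs.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Made-up first-set feature vectors (n = 2) for two subpopulations.
training = {
    "interphase": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "mitosis": [[5.0, 6.0], [5.2, 5.8], [4.8, 6.2]],
}
models = {label: fit_subpopulation(vs) for label, vs in training.items()}
probs = membership_probabilities([4.9, 6.1], models)
```

The cell represented by the second-set vector would then be assigned to the subpopulation with the highest probability.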

31. The method of claim 26, wherein a neural network is applied to 
classify the cell represented by a feature vector representing the second 
parameter set with respect to the feature vector representing the first parameter 
set. 

32. A method according to any preceding claim, wherein said cells 
comprise a nucleic acid reporter construct, preferably a DNA construct, 
comprising a nucleic acid sequence encoding a detectable live-cell reporter 
molecule operably linked to and under the control of: 

i) at least one cell cycle phase-specific expression control element, and 

ii) a destruction control element. 



33. Apparatus arranged to perform the method of any preceding 
claim. 



34. Computer software arranged to perform the method of any of 
claims 1-32. 



35. A data carrier storing the computer software of claim 34. 
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Abstract 

Method of, and computer software for, classification of cells into subpopulations 

A method of classifying cells into subpopulations using cell classifying 
data is described. The method comprises receiving and analyzing image data to 
identify object areas in the image data to determine, for at least one selected first 
cell, one or more measurements. A first parameter set is derived from the 
measurements for the first cell, the first parameter set comprising at least one of 
said one or more measurements. The first set of cells are classified into 
subpopulations, and identified to produce first identifying data. Cell classifying 
data for use in classifying a second set of cells into subpopulations is derived 
from the first parameter set and the first identifying data. A second set of cells 
is classified into subpopulations on the basis of one or more measurements taken 
for cells in the second set of cells, by use of the cell classifying data. The 
parameter sets of cells may be represented as vectors in an n-dimensional space. 
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Figure 4a 
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Figure 11 





[Panel labels: G0/G1/S; G2; Prophase; Mitosis; Metaphase; Anaphase; Telophase; Cytokinesis] 